INTRODUCTION


Many unresolved questions on the physics of the solar wind and its 
effects on magnetospheric processes and cosmic ray propagation can be
addressed with hourly averaged interplanetary plasma and magnetic field 
data. A wealth of such data has been accumulated for almost two decades. 
Recently, much of these data have been assembled onto a single magnetic 
tape available from NSSDC. 

The purposes of this report are: (1) to describe this composite data
set (its content and extent, its sources, its limits of validity, and the
mutual consistency studies and normalizations to which the input data were
subjected), and (2) to present, in the form of digital listings and 27-day
plots, hourly (or 3-hourly) averaged parameters. The listings are contained
in the separately bound Appendix to this Data Book.


DATA CONTENTS AND COVERAGE 

The composite data set contains: (1) interplanetary magnetic field
(IMF) vector data in geocentric solar ecliptic (GSE) and geocentric solar
magnetospheric (GSM) coordinate systems, (2) interplanetary plasma
parameters, and (3) geomagnetic (Kp and C9) and solar (sunspot number R)
activity indexes. The interplanetary field and plasma data were all obtained
by spacecraft in geocentric or selenocentric orbit when those spacecraft
were outside the Earth's bow shock. The identifications of interplanetary
periods for these spacecraft were made by the experimenters who supplied
the data to NSSDC; these identifications are occasionally difficult to
make. The geomagnetic and solar activity indexes were taken from a
compilation prepared and periodically updated by the European Space Agency's
European Space Operations Center and are described in Lenhort (1968).

The field parameters consist of field magnitudes, cartesian components,
direction angles, and certain standard deviations. The plasma parameters
consist of bulk flow speed (V), proton density (N), proton temperature (T),
flow direction longitude (φv) and latitude (θv), and certain standard
deviations (σ). As is detailed below, not all the plasma parameters were
contained in each source data set. Thus, for some hours of the composite
data set, only a subset of the identified plasma parameters are given.

The basic unit of time for the composite data set is 1 hour. All
field data and much plasma data were available in the form of hourly
averages. For those source plasma data sets, identified below, in which only
3-hour values are available, the 3-hour values were assigned to each of
the 3 hours of the averaging interval. The 3-hour Kp index and the daily
C9 and R indexes were treated similarly. For example, a given value of
C9 is repeated in 24 successive hourly records on the composite magnetic
tape.
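
The repetition rule described above can be illustrated with a minimal
sketch. The function name and the sample index values are hypothetical
and are not taken from the composite tape; only the rule itself (one
coarse value copied into every hourly record of its averaging interval)
comes from the text.

```python
def expand_to_hourly(values, hours_per_value):
    """Repeat each coarse-resolution value once per hour of its interval."""
    hourly = []
    for value in values:
        hourly.extend([value] * hours_per_value)
    return hourly


# Hypothetical sample values for one day (not real index data).
kp_3hour = [2.0, 3.3, 1.7, 2.3, 4.0, 3.7, 2.7, 1.3]  # eight 3-hour Kp values
c9_daily = [4]                                       # one daily C9 value

kp_hourly = expand_to_hourly(kp_3hour, 3)    # 24 hourly records
c9_hourly = expand_to_hourly(c9_daily, 24)   # same value in all 24 records
```

A user of the tape who needs independent samples would therefore have to
take every third (or every 24th) record of such parameters.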

Although the details of data merging are given later, it is useful
to note here the general outline of the procedure followed. All the
source plasma data sets were combined onto a single, time-ordered magnetic
tape. For any hour, there were data given separately from up to five
sources. A similar composite IMF tape was also generated, with separate
data from up to three spacecraft for each hour. A normalized composite
plasma tape was generated from a slightly edited version of the tape with
the unnormalized experimenter-supplied data. The composite IMF tape, the
normalized composite plasma tape, and the tape containing the solar and
geomagnetic activity indexes were merged to yield the final composite tape.
The plasma parameters contained on the final tape for a given hour were
selected from one of the possibly several sources available for that hour.
Field parameters were selected in a similar manner. Each of the tapes
involved in the preparation of the final composite tape is available from
NSSDC.

The percent of coverage of the composite data set over the 1963 to
1975 time period is shown in Figure 1 for each Bartels solar rotation
number. Of the 106,920 hours included in Bartels solar rotations 1783
to 1947, there are 45,399 hours with field and plasma data (of which 23,613
hours have field and plasma data from a common spacecraft), 19,755 hours
with field data only, 15,779 hours with plasma data only, and 25,987 hours
with no interplanetary plasma or field data. The time intervals of field
and plasma data are Nov. 27, 1963, to Oct. 28, 1975, and Nov. 27, 1963,
to Dec. 30, 1975, respectively. Of the 61,178 hourly records with plasma
data, 29,160 records actually contain 3-hour averages. It is contemplated
that this composite data set will be updated as additional data become
available.
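
The hour counts quoted above are mutually consistent, as the following
short check suggests. The variable names are illustrative only; the
27-day length of a Bartels rotation and all the counts are taken from the
text.

```python
HOURS_PER_ROTATION = 27 * 24          # a Bartels rotation is 27 days
first_rotation, last_rotation = 1783, 1947

total_hours = (last_rotation - first_rotation + 1) * HOURS_PER_ROTATION

both = 45_399         # hours with field and plasma data
field_only = 19_755
plasma_only = 15_779
neither = 25_987

plasma_hours = both + plasma_only     # every hour that has plasma data
field_hours = both + field_only       # every hour that has field data

# Fractions of the full interval, in percent.
plasma_coverage = 100.0 * plasma_hours / total_hours
field_coverage = 100.0 * field_hours / total_hours
any_coverage = 100.0 * (total_hours - neither) / total_hours
```

The four categories sum to the 106,920 hours of the 165 rotations, and
the plasma total reproduces the 61,178 hourly records cited above.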


DATA SOURCES 


General 

All the source spacecraft used in compiling this composite data set 
are identified in Table 1 in chronological sequence. Each spacecraft is 
assigned a numeric and an alphabetic identifier. The numeric identifier 
represents how a given spacecraft is specified on the magnetic tape and 
in tables and figures in this document. The alphabetic identifiers are 
used in the listing found in The Data Book Appendix. In Table 1, it is 
indicated whether plasma and/or field data from a given spacecraft are 
used in this compilation. 

Plasma Data 

The 11 source data sets from which the plasma data of the new com- 
posite data set were obtained are listed in Table 2. For each source 
data set, the spacecraft, the principal investigator and his institution, 
the averaging interval, the time span, the number of hours on the final 

Figure 1. Composite Data Set Temporal Coverage


Table 1. Spacecraft Providing Interplanetary Medium Data

                                 Numeric      Alphabetic   Plasma   Field
Spacecraft                       Identifier   Identifier   Data     Data

Explorer 18 (IMP 1, IMP A)       18           A            X        X
Merged Vela (Vela 2-6)           99           V            X
Vela 3                           3            V            X
Explorer 28 (IMP 3, IMP C)       28           C                     X
Explorer 33 (AIMP 1, IMP D)      33           D            X        X
Explorer 34 (IMP 4, IMP F)       34           F            X        X
Explorer 35 (AIMP 2, IMP E)      35           E            X        X
OGO 5                            5            O            X
HEOS                             1            X            X        X
Explorer 41 (IMP 5, IMP G)       41           G                     X
Explorer 43 (IMP 6, IMP I)       43           I            X        X
Merged IMP (IMP 6-8)             98           L            X
Explorer 47 (IMP 7, IMP H)       47           H                     X
Explorer 50 (IMP 8, IMP J)       50           J            X        X

Table 2. Source Plasma Data Set Characteristics

                                   Averaging                          Number
Spacecraft   P.I. (Institution)    Time (Hours)  Time Period          of Hours T    N    V    φv    θv    σT  σN  σV  σφ  σθ

Explorer 18  Bridge (MIT)          3             11/27/63 - 2/22/64   1,485         20%  10%
Merged Vela  Bame (LASL)           3             7/21/64 - 3/18/71    10,273             X
Vela 3       Bame (LASL)           3             7/26/65 - 11/13/67   5,721    10%  25%  5%   1.6°        X   X   X   X
Explorer 33  Bridge (MIT)          1             7/6/66 - 9/23/69     5,637    X    X    X    X     X     X   X   X   X   X
Explorer 34  Ogilvie (GSFC)        1             6/3/67 - 12/16/67    2,282    X    10%  3%               X   X   X
Explorer 35  Bridge (MIT)          1             7/28/67 - 7/3/68     3,642    X    X    X    1.5°  0.75° X   X   X   X   X
OGO 5        Neugebauer (JPL)      1             3/5/68 - 4/29/71     2,564    15%  3%   1%   X     X
HEOS 1       Bonetti (CNR, Italy)  3             12/11/68 - 4/15/70   3,142         X    X
Explorer 43  Bame (LASL)           1             3/18/71 - 3/27/73    7,998    X    X    X
Merged IMP   Bame (LASL)           3             3/18/71 - 12/31/74   8,539    15%  30%  2%
Explorer 50  Bridge (MIT)          1             12/1/73 - 12/30/75   9,895    X    X    X    X     X     X   X   X   X   X

composite tape, and an identification of which plasma parameters were 
available are shown. The Vela 3 data set is a composite of data from the 
Vela 3A and Vela 3B satellites. The sources listed as Merged Vela and 
Merged IMP refer to data sets, generated by the LASL plasma physics team, 
that contain data from Velas 2, 3, 4, 5, and 6, and Explorers 43, 47, and 
50 (IMPs 6, 7, and 8), respectively. Because the LASL Explorer 43 data 
set has a time resolution of 1 hour, it has been used (after normalizing 
densities and deleting some suspicious hours) rather than the 3-hour res- 
olution Merged IMP data set, when possible. However, LASL personnel have 
normalized and edited their Explorer 43 data and have folded them into 
the Merged IMP data set. 

Note that an "X" is used in Table 2 to indicate the availability of
some parameters, while, for others, estimates of uncertainties found in
the literature are given. Questions of the levels of reliability of
various parameters are discussed further in connection with the mutual
consistency studies carried out in assembling this composite data set.

The number of hours, listed in Table 2, that each source data set
contributes to the final composite data set is only a fraction of the
available hours for that source data set. The fraction depends on the
data selection priority scheme (discussed later) and the availability of
simultaneous data. The fraction ranges from 40 percent for the Merged
IMP set to 100 percent for the Explorer 18 and 50 sets.

The bulk plasma parameters of each source data set were determined
by each experimenter group by averaging over fine-time scale values of
these bulk parameters. (The number of such values contributing to each
hourly or 3-hourly average is given on the final composite tape for all
source data sets except those from Explorer 18 and Merged IMP.) Fine-time
scale bulk parameters were derived from the spectral and directional
distributions of sensor outputs along with sensor calibration information.
An assumption about the nature of the governing particle distribution
function (e.g., convected isotropic Maxwellian distribution) was also made.
Generally speaking, fine-time scale plasma parameter derivation has
improved with time as spacecraft telemetry rates have increased, thereby
permitting improved temporal, spectral, and directional resolution of
sensor outputs.

Because each experimental group providing data has generally used the 
same instrumentation repeatedly and the same parameter derivation tech- 
nique, the following discussion is of the data sources grouped by insti- 
tution. 

All the Massachusetts Institute of Technology (MIT) data have been
obtained with modulated grid split-collector Faraday cups. The basic
theory of these instruments is discussed in Bridge et al. (1960). In the
derivation of the parameters, it was assumed that the governing
distribution was a convected isotropic Maxwellian or a convected isotropic
Kappa distribution. The latter distribution is a variation of the former
with a high energy tail. The Explorer 18 measurement sequence and some key
results are discussed in Bridge et al. (1965), Olbert (1968), and Egidi et
al. (1969). Explorer 33 details are given in Lyon et al. (1968), while
Explorer 35 details are given in Lyon et al. (1967). For a discussion of
the flow direction angle determination from Explorers 33 and 35, see Egidi
et al. (1977).

All the Los Alamos Scientific Laboratory (LASL) data have been
obtained using hemispherical electrostatic analyzers for energy-per-charge
selection and an electron multiplier for particle counting. Typically,
a convected bi-Maxwellian distribution has been assumed in the bulk
parameter derivation. The single temperature contained in the LASL-supplied
data sets is related to the perpendicular and parallel temperatures
according to T = (1/3)(T∥ + 2T⊥). Further details on the LASL instruments
and data are given in Hundhausen et al. (1967), Gosling et al. (1967),
Bame et al. (1967), and Hundhausen et al. (1970) for Velas 2 and 3;
Montgomery et al. (1970) and Hones et al. (1972) for Vela 4; Bame et al.
(1971) for Velas 5 and 6; Feldman et al. (1973) for Explorer 43; and
Asbridge et al. (1976) for Explorers 47 and 50. Discussions of the Merged
Vela and IMP data sets are found in Gosling et al. (1976) and Feldman et
al. (1976).
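
The relation between the single LASL temperature and the parallel and
perpendicular temperatures of the bi-Maxwellian can be written as a short
function. This is only an illustration of the formula quoted above; the
function name and the sample temperatures are hypothetical.

```python
def single_temperature(t_parallel, t_perpendicular):
    """Return T = (T_parallel + 2 * T_perpendicular) / 3."""
    return (t_parallel + 2.0 * t_perpendicular) / 3.0


# Hypothetical proton temperatures in kelvin.
t_anisotropic = single_temperature(1.2e5, 0.9e5)
# For an isotropic plasma the relation returns the common temperature.
t_isotropic = single_temperature(8.0e4, 8.0e4)
```

The weighting reflects the two perpendicular degrees of freedom and the
one parallel degree of freedom of the distribution.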

The Goddard Space Flight Center (GSFC) Explorer 34 plasma
instrumentation consisted of a curved plate electrostatic analyzer for
energy-per-charge selection, followed by a crossed electric field/magnetic
field device (Wien filter) for velocity selection, followed by a particle
counter. Plasma parameters were derived by taking moments of the observed
distribution function. Further details on the instrumentation and data
analysis are found in Ogilvie et al. (1968a), Ogilvie et al. (1968b), and
Burlaga and Ogilvie (1968).

The Jet Propulsion Laboratory (JPL) OGO 5 plasma instrumentation
consisted of a modulated grid Faraday cup and a curved plate electrostatic
analyzer. Plasma parameters were determined iteratively by appropriately
combining the outputs of the two sunward-looking sensor systems. Details
are provided in Neugebauer (1970). Because of its greater reliability,
the total charge density obtained from the Faraday cup flux, rather than
the ion flux inferred from the electrostatic analyzer, is given in the new
composite data set. It is of interest to note that, except for the
attitude-stabilized OGO 5, all the spacecraft providing plasma data for the
new composite data set were spin stabilized.

The Consiglio Nazionale delle Ricerche (CNR, Italy) HEOS 1 plasma
instrumentation consisted of a hemispherical electrostatic analyzer
followed by a Faraday cup. A convected isotropic Maxwellian distribution
function was assumed in the plasma parameter derivation. Details are
provided in Bonetti et al. (1969). Diodato et al. (1975) have presented
listings of 3-hour averaged bulk speeds and densities from Vela 3,
Explorers 33, 34, and 35, and HEOS 1. The listed averages consist of
combined data from as many spacecraft as were available for each 3-hour
averaging period. Before averaging, the data for each data set were
normalized to Vela 3 values, using the results of the Moreno and Signorini
(1973) regression analysis. As input to our composite data set, only those
Diodato-listed 3-hour averages resulting from HEOS 1 only were taken, and
they have been denormalized. That is, the inverses of the previously used
normalization equations were applied.
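
Denormalization of a linearly normalized value can be sketched as below.
The linear form is an assumption made for illustration, and the
coefficients are hypothetical; they are not the Moreno and Signorini
(1973) values, which are not reproduced in this document.

```python
def normalize(value, slope, intercept):
    """Apply a linear normalization: normalized = slope * value + intercept."""
    return slope * value + intercept


def denormalize(normalized, slope, intercept):
    """Invert the linear normalization to recover the original value."""
    if slope == 0:
        raise ValueError("a zero slope cannot be inverted")
    return (normalized - intercept) / slope


# Hypothetical coefficients and bulk speed (km/s), for illustration only.
slope, intercept = 1.05, -12.0
listed_speed = normalize(420.0, slope, intercept)
recovered_speed = denormalize(listed_speed, slope, intercept)
```

Applying the inverse with the same coefficients returns the value
originally supplied by the experimenter.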

Magnetic Field Data 

The 10 IMF source data sets are listed in Table 3. All were provided
by N. F. Ness and colleagues at GSFC, except the HEOS data set, which is a
merged HEOS 1/HEOS 2 data set provided by P. C. Hedgecock of Imperial
College, London. All the source data sets consisted of 1-hour averages
obtained from fluxgate magnetometer data. All the magnetometers were
triaxial except those on Explorers 18 and 28, which were biaxial. All but
the Explorers 18 and 28 and HEOS magnetometers were flippable to assist in
sensor zero-level determinations. See Hedgecock (1975a) for a discussion
of zero-level determination in the absence of sensor flip capability.
Sensor signal digitization resolution was typically between 0.1 and 0.2
gamma. Estimated upper limits of spacecraft magnetic fields at
magnetometer locations (ends of booms) ranged from 0.5 gamma for early
spacecraft to 0.1 gamma or less for recent spacecraft.

The parameters available in the source data sets consist of hourly 
averaged field cartesian components in solar ecliptic coordinates, the 
magnitude and direction angles of the field vector made up by these three 
average cartesian components, and the averaged field magnitude. For the 
HEOS data set, hourly averaged direction angles and the standard deviations 
in the averaged magnitude and direction angles were also given. For the 
Explorer data sets, standard deviations in the cartesian component averages 
and, for all but Explorers 33, 34, and 35, in the field magnitude average 
were also given. Field components in GSM coordinates were computed at 
NSSDC from GSE components, as will be discussed. 

Hourly averaged values were constructed from fine-time scale field
values (obtained either by measurement or by averaging yet finer scale
data). The fine-time scale was 327 s for Explorers 18 and 28, 48 and 32 s
for HEOS 1 and 2, and between 1 and 5 s for the remaining Explorers. The
327-s resolution field magnitudes are the magnitudes of field vectors
made up of 327-s averaged cartesian components. Thus, field directional
fluctuations with periods between 5 s and 327 s will cause Explorer
18 and 28 hourly averaged magnitudes to be somewhat smaller than
corresponding averaged magnitudes based on 1- to 5-s resolution magnitudes.
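
The effect just described follows from the fact that the magnitude of an
averaged vector cannot exceed the average of the individual magnitudes.
A minimal sketch, using a hypothetical 5-gamma field whose direction
swings within the averaging interval, is:

```python
import math


def magnitude_of_mean_vector(vectors):
    """Magnitude of the vector built from component-wise averages."""
    n = len(vectors)
    mean = [sum(v[i] for v in vectors) / n for i in range(3)]
    return math.sqrt(sum(c * c for c in mean))


def mean_of_magnitudes(vectors):
    """Average of the individual vector magnitudes."""
    return sum(math.sqrt(sum(c * c for c in v)) for v in vectors) / len(vectors)


# Hypothetical fine-scale samples: constant 5-gamma magnitude, with the
# direction varying between -30 and +30 degrees in the x-y plane.
angles_deg = [-30.0, -15.0, 0.0, 15.0, 30.0]
samples = [
    (5.0 * math.cos(math.radians(a)), 5.0 * math.sin(math.radians(a)), 0.0)
    for a in angles_deg
]

from_components = magnitude_of_mean_vector(samples)   # coarse-average style
from_magnitudes = mean_of_magnitudes(samples)         # fine-scale style
```

Here the magnitude built from averaged components falls below 5 gamma,
while the average of the fine-scale magnitudes stays at 5 gamma.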

Much of the IMF data of the new composite data set was already pre- 
sented, in GSE components only, in King (1975). The present compilation 
supersedes that earlier document. 

Table 3. Source Magnetic Field Data Set Characteristics

                                      Number
Spacecraft    Time Period             of Hours   Reference

Explorer 18   11/27/63 - 2/15/64      1,215      Ness et al., 1964
Explorer 28   5/30/65 - 1/29/67       6,233      Ness et al., 1964
Explorer 33   7/4/66 - 7/13/68        8,032      Behannon, 1968
Explorer 34   5/26/67 - 12/27/68      5,388      Fairfield, 1969
Explorer 35   7/26/67 - 11/10/69      2,825      Ness et al., 1967
HEOS          12/11/68 - 10/28/75     15,139     Hedgecock, 1975b
Explorer 41   6/21/69 - 10/26/72      7,373      Fairfield and Ness, 1972
Explorer 43   3/13/71 - 7/21/74       8,690      Fairfield, 1974
Explorer 47   9/26/72 - 4/3/73        1,645      Mish and Lepping, 1976
Explorer 50   10/29/73 - 8/26/75      8,114      Mish and Lepping, 1976


MUTUAL CONSISTENCY 


General 

In the creation of the composite interplanetary medium data set, we
have examined the mutual consistency of the source data sets. For the
plasma data, consideration of regression analysis results and visual
inspection of corresponding scatter plots yielded normalization equations
that were applied to some of the experimenter-supplied parameter values.
In this section, the regression analysis used is described and the results
are discussed for plasma data and for field data. A series of sample
scatter plots, found in the back sections of this document, are discussed,
and the plasma parameter normalizations utilized are listed. The limits of
accuracy of the various parameters in this composite data set are also
discussed.

A linear regression analysis, in which equal random error is assumed 
in both variables, was applied to the simultaneously determined data of 
several pairs of spacecraft. See, for example, Madansky (1959) for de- 
tails. This approach was chosen because both data sets do have random 
error in fact and because the unavoidable chaining of regression equations 
for spacecraft A/B and B/C to obtain A/C relations is more legitimate with 
this approach. 

This approach is in contrast to the more often used approach that
assumes no error in the "independent variable." See, for example,
Neugebauer (1976) and Moreno and Signorini (1973). (The present data have
been run through a no-error-in-the-independent-variable regression
analysis, and regression parameter values were found that are more nearly
similar to those of Neugebauer and of Moreno and Signorini than are the
following parameter values.)

In the present analysis, the regression parameters a and b in the
equation

    P_S1 = a P_S2 + b

(where S1 and S2 denote the two source data sets, and P identifies the
physical parameter) are determined in a way geometrically equivalent to
minimizing the sum of squares of perpendicular distances between data
points and regression line.
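
A fit of this kind can be sketched with the standard closed-form solution
for orthogonal (perpendicular-distance) regression. This is a generic
illustration under the equal-error assumption stated above, not a
reproduction of the program actually used; the function name and the
sample values are hypothetical.

```python
import math


def orthogonal_regression(x, y):
    """Fit y = a*x + b by minimizing squared perpendicular distances.

    Returns (a, b, rms_perp), where rms_perp is the root-mean-square
    perpendicular distance between the points and the fitted line.
    Assumes equal random error in x and y and a nonzero covariance.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("covariance is zero; the slope is undefined")
    diff = syy - sxx
    # Root of sxy*a^2 - diff*a - sxy = 0 whose sign matches the covariance.
    a = (diff + math.sqrt(diff * diff + 4.0 * sxy * sxy)) / (2.0 * sxy)
    b = y_mean - a * x_mean
    squared = sum((yi - a * xi - b) ** 2 for xi, yi in zip(x, y))
    rms_perp = math.sqrt(squared / (n * (1.0 + a * a)))
    return a, b, rms_perp


# Hypothetical simultaneous log-temperature values from two spacecraft.
log_t_s2 = [4.2, 4.5, 4.8, 5.0, 5.3, 5.6]
log_t_s1 = [4.3, 4.4, 4.9, 5.1, 5.2, 5.7]

a_12, b_12, rms_12 = orthogonal_regression(log_t_s2, log_t_s1)
a_21, b_21, rms_21 = orthogonal_regression(log_t_s1, log_t_s2)
```

Because the fit treats both variables symmetrically, exchanging the two
data sets yields the reciprocal slope, which is the property that makes
the chaining of A/B and B/C relations into an A/C relation more
legitimate than with ordinary least squares.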

Before proceeding, it should be noted that, although differences be- 
tween data sets are emphasized in the following discussion, the level of 
agreement is really very high considering that the data were obtained and 
processed at different times using differing instrumentation flown on 
various spacecraft by various principal investigators and their colleagues. 
The high levels of agreement attest to the skill and care with which the 
data were acquired and processed. 

Plasma Data


The results of the regression runs for the logarithms of temperature
and density and for the bulk speed are presented in Figures 2, 3, and 4.
Logarithms were chosen for temperature and density because their
distributions were more Gaussian than were the linear values of temperature
and density. Each figure shows several regression lines and their
equations. Also shown for each line is a slope range that would correspond
to 95 percent confidence limits in the absence of autocorrelations in the
time series being regressed (and to somewhat lower confidence limits in the
presence of such autocorrelations). In addition, the root-mean-square
perpendicular distance (σ⊥) between data points and regression line is
listed and is plotted with the center at the position of the average value
on the regression line. The number of hours folded into each regression run
and the distribution of values found in the composite data set after
normalization are also given in these figures.

The slope ranges of these figures make it clear that there is a
statistically significant difference from unity in the slope values for
many pairs of spacecraft. Further, a regression equation whose slope is
consistent with unity, but whose intercept is comparable to or larger than
the listed root-mean-square perpendicular distance, is also statistically
inconsistent with y = x.
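
The two criteria above can be combined into a simple check. The sketch
below is one possible reading: "comparable to or larger than" is
interpreted as an intercept magnitude at least equal to the
root-mean-square perpendicular distance, which is an assumption. The
first example uses the OGO 5/Explorer 33 bulk speed equation obtained
with the three-points-per-hour restriction; the second is hypothetical.

```python
def consistent_with_identity(slope, slope_range, intercept, rms_perp):
    """Return True if a regression line is compatible with y = x.

    The slope must lie within its quoted range of unity, and the
    intercept must be smaller in magnitude than the rms perpendicular
    distance (an assumed threshold).
    """
    slope_ok = abs(slope - 1.0) <= slope_range
    intercept_ok = abs(intercept) < rms_perp
    return slope_ok and intercept_ok


# V5 = (1.00 +/- .02) V33 - 31.8, with rms perpendicular distance 12.7 km/s.
ogo5_vs_explorer33 = consistent_with_identity(1.00, 0.02, -31.8, 12.7)

# Hypothetical line with unit slope within range and a small intercept.
hypothetical_pair = consistent_with_identity(1.01, 0.02, 0.5, 12.0)
```

The first line has a slope of unity but fails on its intercept, which is
the situation singled out in the text.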

The distributions of values in the final composite data set, indicated 
in Figures 2, 3, and 4, are not identical to the distributions in the de- 
termination of any one of the regression lines, but they can be used as 
measures of the region of parameter space from which the data points were 
taken and outside of which the regression lines are meaningless. The per- 
centages of hours with temperatures, densities, and bulk speeds lying out- 
side the ranges in Figures 2, 3, and 4 are 0.4, 0.8, and 2.9, respectively. 

The hours included in the determination of the regression equations
in Figures 2, 3, and 4 were all hours of simultaneous data, regardless of
the number of fine-time scale points per hour and regardless of whether a
given hourly value was, in fact, a 3-hour average. The difference in
regression parameters obtained was examined with the restriction that each
hourly average be composed of at least three fine-time scale values, a
restriction used by some earlier workers. For most spacecraft pairs there
were only negligible changes in the regression parameters: typically 1 to
2 percent in slope and 1 to 10 percent in intercept and in root-mean-square
perpendicular distance. However, for regressions of OGO 5 data with
Explorers 33 and 35 data, some significant changes resulted. With the three
fine-time-scale-points-per-hour restriction, the following equations were
obtained

[Figure 2 axis labels: LOGARITHM OF TEMPERATURE (log T) on both axes]

                                                           Number
     Regression equation                          σ⊥       of Hours

1.   log T34 = (1.35 ± .16) log T33 - 1.66        .12      205
2.   log T35 = (0.88 ± .05) log T33 + 0.65        .15      1306
3.   log T3  = (1.21 ± .13) log T33 - 0.97        .20      588
4.   log T5  = (1.19 ± .11) log T33 - 0.93        .18      675
5.   log T35 = (0.72 ± .03) log T34 + 1.32        .13      1538
6.   log T35 = (0.82 ± .09) log T5  + 0.87        .15      293
7.   log T43 = (0.89 ± .09) log T5  + 0.67        .13      130
8.   log T50 = (0.90 ± .01) log T98 + 0.56        .08      5297
9.   log T43 = (1.04 ± .01) log T98 - 0.21        .06      8364

Figure 2. Plasma Temperature Regression Results

Figure 3. Plasma Density Regression Results

[Figure 4 axis labels: BULK SPEED (km/s); FREQUENCY OF OCCURRENCE (percent)]

                                                      Number
Regression equation                          σ⊥       of Hours

log T5  = (1.08 ± .09) log T33 - 0.37        0.17     547
log T35 = (0.91 ± .10) log T5  + 0.45        0.11     269
log N5  = (1.12 ± .04) log N33 + 0.01        0.08     547
log N35 = (0.86 ± .03) log N5  + 0.03        0.05     269
V5      = (1.00 ± .02) V33 - 31.8            12.7     547
V35     = (1.08 ± .03) V5  - 11.2            11.4     269


These are to be compared to the appropriate equations in Figures 2, 3,
and 4. No significant change in the OGO 5/HEOS regression parameters was
found.

Note that no regressions involving flow directions are presented, nor
are flow direction angles listed or plotted in this Data Book. This is
because such direction angles are given only in a small number of the
source data sets (cf. Table 2), and because the potential error of
measurement relative to the expected range of flow angles is significantly
larger than the relative error in other plasma parameters. Flow direction
angles, as received from the experimenters, are found on the magnetic tape
from which this Data Book was created.

Scatter plots 1-4, 5-8, and 9-12 are plots for the logarithm of
temperature, logarithm of density, and bulk speed, respectively. These
correspond to a representative portion of the regression analysis results
summarized in Figures 2, 3, and 4. Based on some preliminary scatter plots
and other considerations, a modest number (less than 1 percent) of
questionable experimenter-supplied data hours from Explorers 34, 35, 43,
and 50 were eliminated from the composite data set before the reported
regression runs were made.

Note the anomalously low slope in the V35 versus V34 bulk speed data
between about 330 and 380 km/s. This anomaly also occurs in V35 versus
V33 and V35 versus V5 scatter plots (not shown), but in no other scatter
plots. The conclusion is that the anomaly is in the Explorer 35 data. No
special allowance was made for this apparent Explorer 35 anomaly in
creating the composite data set.

Consider the spread of data points about the regression line. This
variance may arise from a number of sources related to the instruments,
the plasma parameter derivation from sensor outputs, the inadvertent
inclusion of averages affected by terrestrial or lunar effects, and the
solar wind variability itself. This latter effect may be significant
insofar as two spacecraft may be measuring at least partly different plasma
regimes during a given hour (or 3-hour interval) because of their differing
spatial locations and/or their sampling at differing portions of an
averaging interval. Recall that two source spacecraft contributing to this
composite data set may be separated by a few tens of Earth radii (Re) in
a solar wind flowing at 200 - 300 Re/h. This spacecraft separation
effect was not accounted for in building the composite data set, a fact
that yields a slight "fuzziness" in the very concept of the hourly
averaged value of an interplanetary parameter for the Earth. That this
effect of solar wind variability is not the dominant cause of data point
scatter is suggested by the fact that the point spread for the 50/98
(Explorer 50/Merged IMP) regression lines for all 3 plasma parameters is
small relative to the spread for other spacecraft pairs, even though the
50/98 regression involves 1-hour (50) and 3-hour (98) averages.
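
The quoted flow rate of 200 - 300 Re/h can be related to bulk speeds in
km/s with a short conversion. The Earth radius used below (6371 km) is an
assumed mean value, the function names are hypothetical, and the delay
estimate treats the whole separation as lying along the flow direction.

```python
EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def speed_in_re_per_hour(speed_km_s):
    """Convert a bulk speed from km/s to Earth radii per hour."""
    return speed_km_s * 3600.0 / EARTH_RADIUS_KM


def convection_delay_minutes(separation_re, speed_km_s):
    """Time for plasma to travel a separation (in Re) along the flow."""
    return 60.0 * separation_re / speed_in_re_per_hour(speed_km_s)


typical_rate = speed_in_re_per_hour(400.0)           # about 226 Re/h
typical_delay = convection_delay_minutes(30.0, 400.0)  # about 8 minutes
```

For separations of a few tens of Earth radii, the convection delay is
thus a modest fraction of the 1-hour averaging interval.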

However, whatever the source of the spread of data points, this
spread, rather than quoted errors in individually determined plasma
parameters, determines the limits of validity of the composite data set
created by interspersing normalized hourly (or 3-hourly) averages from many
spacecraft. Based on root-mean-square perpendicular distances between data
points and regression lines, as listed in Figures 2, 3, and 4, irreducible
uncertainties in temperatures are estimated as ≈ 40 percent early (≤ 1971)
and ≈ 20 percent late (≥ 1971), in densities as ≈ 20 percent early and
≈ 10 percent late, and in speeds as 15 km/s early and 10 km/s late.

Thus, for example, it is estimated that the probability that any given
(normalized) early-period temperature value is in error by more than
40 percent is ≈ 0.32, which is the probability that a sample point taken
from a Gaussian distribution lies more than one σ away from the population
mean value.
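
The value of 0.32 can be recovered from the complementary error function,
as in the following minimal sketch (the function name is illustrative):

```python
import math


def probability_beyond(k_sigma):
    """Two-sided probability that a Gaussian sample lies more than
    k_sigma standard deviations from the population mean."""
    return math.erfc(k_sigma / math.sqrt(2.0))


p_one_sigma = probability_beyond(1.0)   # approximately 0.317
```

The same function gives the corresponding probability for any other
multiple of the quoted uncertainty.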

Having examined the irreducible variance in the composite data set,
it is desirable to normalize the source data sets to the extent that a
significant improvement in mutual consistency may be achieved. A
complicating factor in the attempt to find appropriate normalizations is
that there are many pairs of overlapping spacecraft, and several of these
pairs are not independent of combinations of other pairs. For example,
regression analyses have been run for Explorers 33/34, 33/35, and 34/35.
These runs involved 205 common 33/34 hours between days 236 and 257 of 1967,
1306 common 33/35 hours between day 236 of 1967 and day 98 of 1968, and
1538 common 34/35 hours between days 205 and 344 of 1967. Combining the
33/34 and 33/35 results (cf. Figures 2, 3, and 4) to infer 34/35 relations
and comparing these to the directly obtained 34/35 relations yields
reasonable consistency in temperature and bulk speed, but a poor measure of
consistency in density.

            Inferred                              Observed

log T35 = 0.65 log T34 + 1.73         log T35 = 0.72 log T34 + 1.32
log N35 = 0.84 log N34 + 0.13         log N35 = 1.01 log N34 + 0.04
V35     = 0.93 V34 + 40.0             V35     = 0.97 V34 + 22.3
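
The inferred temperature relation can be reproduced by chaining the 33/34
and 33/35 regression lines. The sketch below uses the temperature
coefficients as read from Figure 2 (log T34 = 1.35 log T33 - 1.66 and
log T35 = 0.88 log T33 + 0.65); the helper names are hypothetical.

```python
def invert(line):
    """Invert y = a*x + b to x = (1/a)*y - b/a."""
    a, b = line
    return (1.0 / a, -b / a)


def compose(outer, inner):
    """Return the line equivalent to applying inner first, then outer."""
    a_outer, b_outer = outer
    a_inner, b_inner = inner
    return (a_outer * a_inner, a_outer * b_inner + b_outer)


log_t34_from_t33 = (1.35, -1.66)   # (slope, intercept), from Figure 2
log_t35_from_t33 = (0.88, 0.65)

log_t33_from_t34 = invert(log_t34_from_t33)
log_t35_from_t34 = compose(log_t35_from_t33, log_t33_from_t34)

inferred_slope, inferred_intercept = log_t35_from_t34
```

Rounded to two decimals, the result is a slope of 0.65 and an intercept
of 1.73, in agreement with the inferred temperature relation listed
above.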


To some extent, this apparent discrepancy may arise from the different
time periods over which these data were taken, combined with some
undetected or inadequately treated temporal dependence in sensor
characteristics. Although this possibility is examined in more detail in
connection with magnetic fields, temporal variation in sensor
characteristics has generally been neglected in this data compilation, with
some minor exceptions to be noted later. It is of interest that upon
comparing inferred and observed 34/35 relations based on only 166 hours of
simultaneously available 33, 34, and 35 data, a high level of consistency
was found.

The possibility of seasonal variation in the Explorer 33/35
regressions, which might arise from the fact that the Explorer 33 spin
vector lay in the ecliptic plane, was examined. (All subsequently launched
Explorer spacecraft used in this compilation had spin vectors normal to the
ecliptic plane.) For days 236 - 255 of 1967, and then days 18 - 98 of
1968, it was found that

                                                           Number
                                                  σ⊥       of Hours

1967   log T35 = (0.89 ± .06) log T33 + 0.59      0.15     1061
       log N35 = (0.91 ± .02) log N33 - 0.08      0.05
       V35     = (1.04 ± .01) V33 - 29.3          12.8

1968   log T35 = (0.67 ± .10) log T33 + 1.62      0.11     245
       log N35 = (1.04 ± .03) log N33 - 0.02      0.03
       V35     = (1.08 ± .04) V33 - 42.1          8.6

There are apparently significant differences between these two subsets of
the Explorer 33/35 data. Nevertheless, because inspection of the
appropriate scatter plots revealed that the 245 data points for 1968
populate a region of parameter space (for each of the three parameters)
entirely populated by some of the 1061 points for 1967, the choice was made
to neglect seasonal variations in performing plasma parameter
normalizations. Further, since the 33/35 regression parameters for all 1306
points of 1967 to 1968 are close in value to the corresponding 33/35
parameters for the 1061 points of 1967 only, it appears that seasonal
variation is not responsible for the previously discussed discrepancy in
observed and inferred 35/34 regression parameters.

Because of many uncertainties in the analysis (dependence of regres- 
sion results on data subset used, occurrence of spurious points despite 
attempts to eliminate such points, possible time dependence in spacecraft 
and/or sensor characteristics, autocorrelations in time series being re- 
gressed, etc.), and because of the need to perform normalizations simul- 
taneously and consistently for many overlapping data sets, it was decided 
to normalize temperatures and densities only when visually better fits to 
y = x in scatter plots could be achieved. Bulk speed data have been
normalized on the basis of the regression equations in Figure 4, as
discussed further below.

The normalized parameter values T_n, N_n, and V_n found in the final
composite data set are related to the experimenter-supplied values through
the normalization equations listed in Table 4. Many points deserve note:

1. The Explorer 18 densities and speeds overlapped with no
other available data and could not be normalized. Their
limits of validity, relative to the rest of the data set,
are uncertain.

2. The "Merged Vela" speeds were separately considered for 
periods before and after Jan. 1, 1968, and different 
normalization equations were chosen. 

3. The early data (sources 1, 3, 5, 33, 34, 35, and 99) were
normalized independently of the later data (sources 43,
50, and 98). In the later period, the 43 and 50 data
were normalized to the 98 data. This separation into
early and late data follows from the fact that the only
overlap between an early period source and a late period
source is the set of 130 OGO 5/Explorer 43 common hours
obtained in March and April of 1971 (cf. scatter plots
3, 7, and 11). Given the smallness of this number of
hours, and given the fact that the OGO 5 instrumentation
was 3 years postlaunch during these hours, it did not
seem justifiable to use the OGO 5/Explorer 43 regression
to normalize the early and/or late period data to a
common standard. This inability to make a dependable
early/late normalization will probably not introduce any
gross errors into the study of solar cycle variations
with this data set; nevertheless, this point should be
kept in mind in such studies.

4. Many speeds have been normalized by relatively small
amounts, because reliability in absolute speed values
is important for studies attempting to link features
at 1 AU with solar features. (Note that a 10 to 15
km/s uncertainty in speed for an ≈400 km/s solar wind
yields a solar source longitude uncertainty of 1.4° to
2.1° with the frequently used constant radial velocity
approximation.) In the early data (≤ 1971), speed
normalizations were chosen using the numerous regressions
between Explorer 33 and other spacecraft. It was assumed
that a weighted average of these equations would
yield a relationship between Explorer 33 speeds and
"true" speeds and that this relationship could then be
used with the Explorer 33/Spacecraft X regression
result to yield a relationship between Spacecraft X
speeds and "true" speeds.

5. There was a certain amount of arbitrariness in arriving
at the normalization equations given in Table 4. However,
for the most part, other reasonable choices of
normalization parameters would lead to normalized parameter
values different from the values obtained by amounts
less than the previously discussed intrinsic uncertainties
in the composite data set. Nevertheless, the composite
unnormalized solar wind tape is available from
NSSDC if the reader wishes to make a different
normalization.


Table 4. Normalization Equations Used

Spacecraft
Identifier    log T_n =                 log N_n =                   V_n =
----------    ----------------------    ------------------------    ------------------
18            -                         log N_18                    V_18
3             0.8 + 0.83 log T_3        -0.222 + 1.16 log N_3       26 + 0.99 V_3
99 (≤ '67)    -                         -                           26 + 0.99 V_99
99 (≥ '68)    -                         -                           V_99
33            log T_33                  log N_33                    -44 + 1.05 V_33
34            1.2 + 0.75 log T_34       log N_34                    9 + 0.98 V_34
35            log T_35                  log N_35                    -8 + V_35
5             log T_5                   0.9 log N_5                 -10 + 1.05 V_5
1             -                         log N_1                     32 + 0.98 V_1
98            log T_98                  log N_98                    V_98
43            log T_43                  0.097 + log N_43            V_43
50            -0.62 + 1.1 log T_50      0.121 + 0.89 log N_50       V_50
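
The speed normalizations of Table 4 and the source longitude remark in
note 4 can be illustrated with a short sketch. The coefficients below are
transcribed from the V_n column of Table 4 for a few spacecraft; the
function names, the 27.27-day synodic rotation period, and the value used
for 1 AU are assumptions made for this illustration and are not stated in
the text.

```python
AU_KM = 1.495978707e8          # 1 AU in km (assumed value)
SYNODIC_PERIOD_DAYS = 27.27    # assumed solar rotation period seen from Earth

# (intercept, slope) in V_n = intercept + slope * V, from Table 4
SPEED_NORMALIZATION = {
    33: (-44.0, 1.05),
    34: (9.0, 0.98),
    35: (-8.0, 1.00),
    5: (-10.0, 1.05),
    1: (32.0, 0.98),
}


def normalize_speed(source_id, v_kms):
    """Apply the Table 4 speed normalization for one source spacecraft."""
    intercept, slope = SPEED_NORMALIZATION[source_id]
    return intercept + slope * v_kms


def corotation_angle_deg(v_kms):
    """Solar longitude swept during the Sun-to-Earth transit,
    using the constant radial velocity approximation."""
    transit_s = AU_KM / v_kms
    omega_deg_per_s = 360.0 / (SYNODIC_PERIOD_DAYS * 86400.0)
    return omega_deg_per_s * transit_s


def longitude_uncertainty_deg(v_kms, dv_kms):
    """First-order source longitude error for a speed error dv_kms."""
    return corotation_angle_deg(v_kms) * dv_kms / v_kms


if __name__ == "__main__":
    print(normalize_speed(33, 400.0))
    print(round(longitude_uncertainty_deg(400.0, 10.0), 1))
    print(round(longitude_uncertainty_deg(400.0, 15.0), 1))
```

Under these assumptions the last two lines give 1.4 and 2.1 degrees, in
agreement with the range quoted in note 4.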

Magnetic Field Data 

The regression analysis results for the interplanetary magnetic field
data are given in Tables 5 and 6. The notation follows from an equation
of the form

    P_S1 = a P_S2 + b

where P denotes the parameter, S1 and S2 identify the two spacecraft,
a is the slope, and b is the intercept. Note that slope values are given
with limits that would correspond to 95 percent confidence limits in the
absence of autocorrelations and which, given the presence of some
autocorrelations, in fact correspond to somewhat lower confidence limits.
The column labeled σ⊥ gives the root-mean-square perpendicular distance
between data points and the "best fit" regression line. Table 5 relates
to field cartesian components (solar ecliptic coordinates), and Table 6
relates to the average field magnitudes and to direction angles derived
from averaged cartesian components.
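
As an illustration of the quantities tabulated, the following sketch fits
a line to paired hourly values and evaluates the root-mean-square
perpendicular distance of the points from that line. The fitting procedure
is not restated in this section, so ordinary least squares is used here
purely as an assumption; the function names are likewise hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope a and intercept b for y = a*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def rms_perpendicular_distance(x, y, a, b):
    """RMS of the perpendicular distances from (x, y) points to y = a*x + b.

    The perpendicular distance of one point is |y - a*x - b| / sqrt(1 + a^2).
    """
    norm = math.sqrt(1.0 + a * a)
    squares = [((yi - a * xi - b) / norm) ** 2 for xi, yi in zip(x, y)]
    return math.sqrt(sum(squares) / len(squares))


if __name__ == "__main__":
    # Hypothetical hourly Bx values (gammas) from two spacecraft.
    s2 = [-4.0, -2.0, 0.0, 1.5, 3.0, 5.0]
    s1 = [-3.8, -2.3, 0.2, 1.4, 3.3, 4.9]
    a, b = fit_line(s2, s1)
    print(a, b, rms_perpendicular_distance(s2, s1, a, b))
```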

The units of the b and σ⊥ columns are gammas and degrees, as appropriate.
In selecting hourly averages for analysis, no restriction on the
minimum number of fine-time scale points per hour was imposed. That there
are fewer hours in the field longitude regression equation determination
than for other parameters results from the exclusion of hours when
|φ_S1 - φ_S2| > 180°. (Such hours of, for example, φ_S1 ≈ 10° and
φ_S2 ≈ 350° are appropriate for inclusion in regression analysis for other
parameters; that such hours were not included in the φ regression, with
φ_S1 set equal to 370° in the example given, is expected to introduce no
bias in the φ regression results.)
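
The exclusion rule for the longitude regression can be sketched as a
simple filter on paired hourly values. The function name and the sample
values are hypothetical.

```python
def longitude_pairs_for_regression(phi_s1, phi_s2):
    """Keep only hours for which |phi_S1 - phi_S2| <= 180 degrees.

    Pairs that straddle the 0/360 degree boundary (for example 10 and 350)
    are dropped rather than unwrapped, as described in the text.
    """
    return [(p1, p2) for p1, p2 in zip(phi_s1, phi_s2)
            if abs(p1 - p2) <= 180.0]


if __name__ == "__main__":
    phi_s1 = [10.0, 100.0, 200.0, 355.0]
    phi_s2 = [350.0, 110.0, 190.0, 5.0]
    print(longitude_pairs_for_regression(phi_s1, phi_s2))
```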

Scatter plots 13-27 correspond to selected regression runs in Table 
5. From inspection of this table and those figures, several points may 
be made: 


Table 5. Regression Results for Field Cartesian Components

                    P_S1 = a P_S2 + b

          Number of
S1   S2   Points      P     a              b              σ⊥
--   --   ---------   --    -----------    -----------    -----------
28   33   1183        Bx    1.00 ± .02      0.16          [illegible]
                      By    0.93 ± .02     -0.02          [illegible]
                      Bz    1.04 ± .06     -0.05          1.55
34   33   1497        Bx    0.94 ± .02      0.03          0.83
                      By    0.96 ± .01     -0.03          0.77
                      Bz    0.97 ± .03      0.10          0.97
33   35   3040        Bx    1.09 ± .01     [illegible]    0.75
                      By    1.09 ± .01     [illegible]    0.74
                      Bz    1.07 ± .02     [illegible]    0.96
34   35   3145        Bx    0.91 ± .03     -0.17          1.79
                      By    1.09 ± .01      0.09          0.83
                      Bz    0.91 ± .02      0.61          1.14
1    35   1156        Bx    1.08 ± .02      0.17          1.01
                      By    1.09 ± .02     -0.01          1.07
                      Bz    1.01 ± .02      0.72          1.05
41   1    2021        Bx    1.00 ± .01     -0.09          0.72
                      By    0.99 ± .01      0.00          0.70
                      Bz    0.99 ± .02     -0.10          0.66
41   43   1424        Bx    1.03 ± .01     -0.07          0.50
                      By    1.01 ± .01      0.05          0.52
                      Bz    0.98 ± .02      0.05          0.56
47   43   755         Bx    1.05 ± .03      0.00          0.98
                      By    1.01 ± .03      0.13          0.99
                      Bz    1.02 ± .04      0.19          1.40
50   43   1657        Bx    0.89 ± .02      0.15          1.35
                      By    1.01 ± .02     -0.06          0.82
                      Bz    0.94 ± .02      0.05          0.85
1    43   4898        Bx    0.98 ± .01     -0.03          1.17
                      By    1.00 ± .01      0.03          0.94
                      Bz    0.98 ± .01      0.09          0.90
1    47   1675        Bx    0.98 ± .02      0.03          1.00
                      By    1.01 ± .02      0.03          1.09
                      Bz    0.99 ± .03     -0.11          0.91
1    50   3130        Bx    1.04 ± .01      0.00          0.94
                      By    0.97 ± .02      0.10          1.07
                      Bz    1.00 ± .02      0.08          0.88


Table 6. Regression Results for Field Magnitude and Angles

                    P_S1 = a P_S2 + b

          Number of
S1   S2   Points      P     a              b              σ⊥
--   --   ---------   --    -----------    -----------    -----------
28   33   1183        B     0.96 ± .02     [illegible]    [illegible]
          1183        θ     1.13 ± .08     [illegible]    [illegible]
          1130        φ     1.01 ± .02     [illegible]    [illegible]
34   33   1497        B     0.97 ± .01     -0.01          [illegible]
          1497        θ     1.01 ± .04      1.18          12.14
          1418        φ     1.00 ± .01      0.19          15.24
33   35   3040        B     1.06 ± .01      0.02          0.40
          3040        θ     1.01 ± .01      8.73          11.92
          2930        φ     [illegible]    -0.19          15.26
34   35   3145        B     0.84 ± .02      0.99          0.94
          3145        θ     0.99 ± .03      7.91          12.87
          2969        φ     0.98 ± .01      5.96          20.09
1    35   1156        B     1.06 ± .03     -0.07          0.97
          1156        θ     0.95 ± .04      7.74          13.06
          1098        φ     1.01 ± .02     -1.13          17.10
41   1    2021        B     1.00 ± .01      0.06          0.35
          2021        θ     1.00 ± .02     -1.10          9.02
          1959        φ     0.99 ± .01      4.88          15.19
41   43   1424        B     1.01 ± .01     -0.03          0.31
          1424        θ     0.98 ± .02      1.05          8.32
          1381        φ     0.99 ± .01      3.47          11.44
47   43   755         B     1.02 ± .01      0.05          0.30
          755         θ     1.01 ± .06      2.70          13.05
          729         φ     1.01 ± .02     -0.76          16.85
50   43   1657        B     0.80 ± .02      1.21          1.17
          1657        θ     1.02 ± .03      1.31          9.11
          1583        φ     1.00 ± .01     -4.05          12.85
1    43   4898        B     0.93 ± .01      0.38          0.96
          4898        θ     1.00 ± .02      1.24          9.47
          4649        φ     0.99 ± .01      1.90          15.02
1    47   1675        B     0.98 ± .01     -0.04          0.41
          1675        θ     0.97 ± .03     -1.93          12.02
          1584        φ     1.01 ± .01     -6.19          18.53
1    50   3130        B     1.01 ± .01     -0.07          0.35
          3130        θ     1.01 ± .02      1.07          11.11
          2927        φ     1.00 ± .01      3.59          19.04


1. The slopes are different from unity by several percent in
many cases. This implies errors of several percent in
effective sensitivity factors in one or both spacecraft
involved (Ring and Ness, 1977). There is no unique and
consistent way to determine in which source data sets these
apparent errors occur, despite the availability of several
different spacecraft pairs.

2. The root-mean-square perpendicular distance between data 
points and regression line is typically of the order of 
0.5 - 1.0 gamma. Variability between spacecraft pairs 
results from both differing widths of the main clusters 
of points as well as different numbers of far-outlying 
points. The factors yielding non-zero root-mean-square 
perpendicular distances are addressed in the preceding 
discussion of plasma data. The significance of these 
point spreads is that they yield the limits of validity 
of the corresponding parameter. Thus, a listed value of 
B x represents the "true" hourly averaged IMF B x component 
for Earth to within 0.5 to 1.0 gamma. 

3. The regression line intercepts are always less than 0.2
gamma (and often less than 0.1 gamma) except for those
involving Bz as measured by Explorer 35. It appears
that the Explorer 35 Bz values are too small (too negative)
by about 0.7 gamma, and that sensor zero levels
for the other spacecraft involved in the regressions
have been well determined. It is appropriate to note
that in King (1975) it was found that, when Bz was
averaged over all available hours separately for each
source data set, all such averages were within 0.2 γ
of zero, except for Explorer 18 (Bz = -1.0 γ based on
1215 hours) and Explorer 35 (Bz = -0.7 γ).

Inspection of Table 6 reveals that typical uncertainties in the "true"
hourly averaged IMF magnitude, latitude, and longitude angles for Earth
are ≈0.3 to 1.0 γ, ≈10° to 15°, and ≈15° to 20°, respectively. The
previously noted ≈0.7 γ offset in Explorer 35 Bz is reflected in the ≈8°
offset in the θ regression runs involving Explorer 35. Otherwise, intercepts
for the θ regressions are all reasonably close to zero. The intercepts
for the φ regressions exhibit a surprisingly large range of up to ±6°,
although there is no unique and consistent way to assign angle offsets to
specific source data sets.

It is apparent from Table 6 that the field magnitude regression slope
has an unusually low value (0.80) for Explorer 50/Explorer 43. This is in
contrast to the fact that Explorer 50/HEOS and Explorer 43/HEOS field
magnitude regression slopes are both much closer to unity (1.01 and 0.93).
A similar inconsistency is visible in the 34/33, 35/33, and 34/35
regression runs.

Given the availability of more than 3 years of overlapping HEOS/Explorer 43
data, the possibility of time dependencies in regression parameters
that might be at least partially responsible for such inconsistencies
has been examined. The results are summarized in Table 7. Note that
there is a slight trend for the field component regression line slopes to
decrease with time, with a statistically significant decrease of about 10
percent in 1974 relative to 1973. Note the more dramatic variation in the
field magnitude regression line slopes. This difference in the character
of the temporal changes between field magnitude and field component
regression parameters is, at least in part, due to the application of the
same regression analysis to dissimilar distributions of field component
values (quasi-normal) and field magnitude values (non-normal).

It appears that time variations in sensor characteristics may yield
some inconsistencies in comparing results from apparently redundant triads
of spacecraft pairs. Nevertheless, it is difficult to uniquely assign time
variations to specific source data sets.

Despite the present findings of regression line slopes different from
unity, and some intercepts different from zero, no IMF data normalizations
have been performed, because the line y = x passes through the main cluster
of IMF data points on the scatter plots shown (and on those not shown).
Equivalently, the changes in parameters brought about by appropriate
normalization would be less than the previously discussed uncertainties in
these parameters.

These mutual consistency results have been included to give the
potential data user both quantitative and qualitative insight into the
limits of validity of the composite data set. If the reader believes a
specific study would profit from data normalizations, it is advised that
normalization be done. The magnetic tape containing up to three sets of IMF
data from different spacecraft per hour is available from NSSDC if the
reader wishes to test data mutual consistency in some manner other than
that employed herein.


Table 7. Regression Results for HEOS/Explorer 43 Field Data

                    P_1 = a P_43 + b

        Number of
Year    Points      P     a              b              σ⊥
----    ---------   --    -----------    -----------    -----------
1971    102         Bx    1.03 ± .06     [illegible]    [illegible]
                    By    1.06 ± .10     [illegible]    [illegible]
                    Bz    1.06 ± .13      0.01          0.75
                    B     0.95 ± .04      0.47          0.26
1972    1425        Bx    1.04 ± .02     -0.08          0.79
                    By    1.02 ± .01      0.06          0.78
                    Bz    1.02 ± .02      0.02          0.83
                    B     1.08 ± .02      0.43          0.75
1973    1592        Bx    1.01 ± .02     -0.08          0.90
                    By    1.02 ± .02     -0.02          0.98
                    Bz    1.02 ± .03     -0.02          0.89
                    B     1.01 ± .01     -0.08          0.33
1974    1779        Bx    0.90 ± .03     [illegible]    1.57
                    By    0.96 ± .02     [illegible]    1.01
                    Bz    0.90 ± .03     [illegible]    0.96
                    B     0.71 ± .05      1.67          1.27


DATA SELECTION

The final composite data set was assembled from the composite IMF
tape, the normalized composite plasma tape, and a tape with geomagnetic
and solar activity indexes. For a given hour, the plasma and field data
were each taken from one of possibly several available source data sets
according to the following priority scheme.

Plasma data were considered first. If the plasma spacecraft used
for the preceding hour was available, and it had a 1-hour resolution, it
was chosen. If the plasma spacecraft used for the preceding hour was not
available, or if it had a 3-hour resolution, that source having at least
three fine-time scale points per hour and having the highest priority was
chosen. The priority ordering was (high to low) 33, 35, 34, 3, 50, 43, 5,
1, 99, and 98, determined somewhat arbitrarily on the basis of available
parameters and temporal resolution. If no source was chosen using the
just mentioned criteria, the fine-time-scale-points-per-hour criterion was
dropped, and the same priority criterion was reapplied.

Then IMF data were taken from the same spacecraft, if available, from
which the plasma data were just chosen. However, if this spacecraft was
not available, if it was Explorer 35, or if there were no plasma data for
the current hour, IMF data were taken from the spacecraft providing IMF
data for the previous hour. Again, if this spacecraft was not available
or was Explorer 35, IMF data were taken from the highest priority
spacecraft available, according to the priority ordering (high to low) 50,
47, 43, 1, 41, 34, 33, 28, and 35. Note that Explorer 35 IMF data appear in
the final composite data set only for those 2825 hours when IMF data were
available from no other source.
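
The priority scheme just described can be sketched for a single hour as
follows. The priority orderings are taken from the text; the function
names and the layout of the per-hour availability records (a dictionary
with "points" and "resolution_hours" entries for plasma, a set of source
identifiers for IMF) are hypothetical and chosen only for illustration.

```python
PLASMA_PRIORITY = [33, 35, 34, 3, 50, 43, 5, 1, 99, 98]
IMF_PRIORITY = [50, 47, 43, 1, 41, 34, 33, 28, 35]


def choose_plasma(available, previous):
    """Pick the plasma source for one hour.

    available: {source_id: {"points": int, "resolution_hours": int}}
    previous:  source used for the preceding hour, or None
    """
    # Continuity: keep the previous source if present with 1-hour resolution.
    if previous in available and available[previous]["resolution_hours"] == 1:
        return previous
    # First pass requires at least three fine-time scale points per hour;
    # second pass drops that criterion and reapplies the same priority.
    for min_points in (3, 0):
        for source in PLASMA_PRIORITY:
            if source in available and available[source]["points"] >= min_points:
                return source
    return None


def choose_imf(available, plasma_choice, previous):
    """Pick the IMF source for one hour.

    available:     set of source ids with IMF data for this hour
    plasma_choice: plasma source chosen for this hour, or None
    previous:      IMF source used for the preceding hour, or None
    """
    # Prefer the plasma spacecraft, then the previous IMF spacecraft,
    # but never Explorer 35 at this stage.
    for candidate in (plasma_choice, previous):
        if candidate is not None and candidate != 35 and candidate in available:
            return candidate
    # Fall back on the priority ordering, in which Explorer 35 is last.
    for source in IMF_PRIORITY:
        if source in available:
            return source
    return None
```

In this sketch Explorer 35 IMF data can be returned only by the final
loop, and only when no other source is present for the hour, which
matches the closing remark above.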


FIELD COMPONENT TRANSFORMATIONS 

There are several orthogonal, right-handed coordinate systems in 
which interplanetary vector quantities are usefully expressed. In geo- 
centric solar ecliptic (GSE) coordinates, the X-axis points from the Earth 
to the Sun and the Z-axis is normal to the ecliptic plane, positive north- 
ward. Geocentric solar equatorial coordinates also have an X-axis pointing 
from the Earth to the Sun, but have a Y-axis lying in a plane parallel to 
the solar equatorial plane, positive in a direction roughly opposite that 
of planetary motion. In this system, which differs from the GSE system by 
7.25° at most, the ideal spiral magnetic field (Parker, 1958) has no Z com- 
ponent. In geocentric solar magnetospheric (GSM) coordinates, the X-axis 
again points from the Earth to the Sun, while the Z-axis lies in a plane 
containing the X-axis and the Earth's magnetic dipole axis and is positive 
northward. The GSM system is appropriate for studies of magnetospheric 
effects of IMF variations. See Russell (1971) for a more detailed discus- 
sion of these and other coordinate systems and the transformations among 
them. 
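
Because the GSE and GSM systems share the X-axis, the transformation
between them is a rotation about that axis. A minimal sketch follows; it
assumes the caller supplies the direction of the Earth's magnetic dipole
axis (northward sense) already expressed in GSE coordinates, which in
practice depends on date and time of day (see Russell, 1971). The function
and argument names are hypothetical.

```python
import math


def gse_to_gsm(b_gse, dipole_north_gse):
    """Rotate a vector from GSE to GSM coordinates.

    b_gse:            (Bx, By, Bz) in GSE
    dipole_north_gse: vector along the dipole axis, northward sense, in GSE
    """
    bx, by, bz = b_gse
    _, dy, dz = dipole_north_gse
    norm = math.hypot(dy, dz)
    if norm == 0.0:
        raise ValueError("dipole axis is parallel to the Earth-Sun line")
    # The GSM Z-axis is the projection of the dipole axis onto the GSE
    # Y-Z plane, so that the GSM X-Z plane contains the dipole axis.
    zy, zz = dy / norm, dz / norm
    # The GSM Y-axis completes the right-handed set: Y = Z x X = (0, zz, -zy).
    by_gsm = by * zz - bz * zy
    bz_gsm = by * zy + bz * zz
    return (bx, by_gsm, bz_gsm)


if __name__ == "__main__":
    # Hypothetical field vector and a dipole tilted 20 degrees toward +Y.
    tilt = math.radians(20.0)
    print(gse_to_gsm((3.0, -2.0, 1.0), (0.0, math.sin(tilt), math.cos(tilt))))
```

The X component and the field magnitude are unchanged by the rotation.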


The solar wind flow direction angles were provided in GSE coordinates
and are contained on the composite tape in these coordinates only. The
IMF data, given only in GSE coordinates in the source data sets and in the
predecessor to this Data Book (King, 1975), are given in the present
composite data set in both GSE and GSM coordinate systems. The required
transformations were performed at NSSDC.


IMF VECTOR STANDARD DEVIATION 

As indicated previously, standard deviations for hourly averages of
various IMF parameters were made available in various source data sets.
However, there was no parameter for which a standard deviation was given
in all source data sets.

In order to have a consistent measure of field fluctuations for the
composite data in the predecessor to this Data Book, a "vector standard
deviation" was computed as (σ_Bx^2 + σ_By^2 + σ_Bz^2)^(1/2) for Explorer
data sets and as (σ_B^2 + B^2 σ_θ^2 + B^2 cos^2 θ σ_φ^2)^(1/2) for HEOS
records. Insofar as these expressions represented the lengths of the
diagonals of "uncertainty elements" at the tips of the hourly averaged
field vectors, they were taken to yield a quasi-homogeneous set of data
when interspersed.

However, it has subsequently been pointed out (Svalgaard, 1976) that
the expression σ_Bx^2 + σ_By^2 + σ_Bz^2 is analytically equivalent to the
expression σ_B^2 + B^2 - F^2, where B is the average field magnitude, σ_B
its standard deviation, and F the length of the vector constituted by the
averaged cartesian components. Accordingly, the vector standard deviation
contained in the new composite data set (tape and listings of this Data
Book) is (σ_Bx^2 + σ_By^2 + σ_Bz^2)^(1/2) for Explorer records and
(σ_B^2 + B^2 - F^2)^(1/2) for HEOS records.
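
The equivalence of the two expressions can be checked numerically with a
short sketch. It assumes that all variances are population variances
(division by the number of fine-time scale points) taken over the same
set of points within the hour; the function names and sample values are
hypothetical.

```python
import math


def vector_std_from_components(samples):
    """(sigma_Bx^2 + sigma_By^2 + sigma_Bz^2)^(1/2) from (Bx, By, Bz) samples."""
    n = len(samples)
    total = 0.0
    for i in range(3):
        mean_i = sum(s[i] for s in samples) / n
        total += sum((s[i] - mean_i) ** 2 for s in samples) / n
    return math.sqrt(total)


def vector_std_from_magnitude(samples):
    """(sigma_B^2 + B^2 - F^2)^(1/2) from the same samples.

    B is the average field magnitude, sigma_B its standard deviation, and
    F the length of the vector of averaged cartesian components.
    """
    n = len(samples)
    magnitudes = [math.sqrt(sum(c * c for c in s)) for s in samples]
    b_avg = sum(magnitudes) / n
    sigma_b_sq = sum((m - b_avg) ** 2 for m in magnitudes) / n
    f_sq = sum((sum(s[i] for s in samples) / n) ** 2 for i in range(3))
    # Guard against a tiny negative value caused by rounding.
    return math.sqrt(max(sigma_b_sq + b_avg ** 2 - f_sq, 0.0))


if __name__ == "__main__":
    hour = [(1.0, 2.0, 3.0), (2.0, 0.0, 4.0), (-1.0, 1.0, 5.0), (3.0, 3.0, 2.0)]
    print(vector_std_from_components(hour), vector_std_from_magnitude(hour))
```

Both functions return the same value to within rounding, since each
expression reduces to the mean of |B|^2 minus F^2.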


DATA PRESENTATION 

The composite interplanetary plasma/magnetic field data set has been
assembled onto a single magnetic tape with one record for each hour of
Bartels' solar rotations 1783 through 1947 (Nov. 2, 1963 to Jan. 12, 1976).
The data found in a given record consist of a flag to indicate whether
there are plasma and/or field data (or neither) for that hour, time
information and Bartels' rotation number, identifiers for the plasma and
field source spacecraft, numbers of fine-time scale points in the plasma
and field averages, average field magnitude and GSE and GSM cartesian
components, magnitude and latitude and longitude angles of the vector
comprised by the GSE cartesian components, standard deviations in the
average magnitude and in cartesian component averages (Explorer IMF data)
or in field angle averages (HEOS IMF data), field vector standard deviation
(see previous section for discussion of this parameter), proton
temperature, proton density, bulk flow speed and direction angles, standard
deviations in the plasma parameters, geomagnetic activity indexes Kp and
C9, and the sunspot number R. The initial flag, the time and solar rotation
words, and the geomagnetic activity indexes and sunspot number words have
meaningful values for all hours. Plasma (field) words are filled with zeros
for hours when no plasma (field) data were available. In addition,
individual words corresponding to parameters not provided in the source
data set are also filled with zeros. This tape (which may be updated as
warranted) is available from NSSDC with a detailed format statement.
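
The Bartels rotation limits quoted above can be reproduced with a short
sketch. It relies on the standard convention that Bartels rotation 1
began on February 8, 1832, and that each rotation lasts exactly 27 days;
that epoch is not given in this Data Book, and the function names are
hypothetical.

```python
from datetime import date, timedelta

BARTELS_EPOCH = date(1832, 2, 8)   # first day of Bartels rotation 1
ROTATION_DAYS = 27


def bartels_rotation(day):
    """Bartels rotation number containing the given calendar date."""
    return (day - BARTELS_EPOCH).days // ROTATION_DAYS + 1


def rotation_start(number):
    """First calendar day of the given Bartels rotation."""
    return BARTELS_EPOCH + timedelta(days=ROTATION_DAYS * (number - 1))


if __name__ == "__main__":
    print(rotation_start(1783))                       # 1963-11-02
    print(rotation_start(1948) - timedelta(days=1))   # 1976-01-12
    print(bartels_rotation(date(1971, 3, 15)))
```

The first two printed dates agree with the coverage stated for the tape.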

The Data Book consists of graphical and tabular presentations of some
of the parameters of the composite data set. There are two plots for each
solar rotation in which any plasma or field data were obtained. On facing
pages, for convenience in lining up features in the data, are found a
plot of plasma data (temperature, density, and bulk speed) and a plot of
field data (average magnitude, GSM Bz component, and GSE latitude and
longitude angles of the average field vector). Note the 450° range in the
cyclic field longitude angle, employed to decrease the number of times the
trace crosses the plot in response to small excursions in the field
direction. Note that on those rare occasions when the parameter values
exceed the allowed range, a heavy mark is placed near the edge of the plot.
For such cases, the reader is advised to consult the data listings in the
Appendix for appropriate numerical values.

In a separately bound Appendix to this Interplanetary Medium Data
Book are found listings of selected hourly parameters, which include
plasma temperature (in units of 1000 °K), proton density (cm^-3), bulk
speed (km/s), and an identifier of the spacecraft from which the plasma
data were taken. Also found with the plasma data are the field parameters:
average magnitude, GSM cartesian components, latitude and longitude angles
of the vector made up of the average GSE field components, the previously
discussed vector standard deviation, and an identifier of the IMF
spacecraft. Note that to economize space, one-character alphabetic
spacecraft identifiers have been used (as in this document's predecessor,
Interplanetary Magnetic Field Data Book) although numeric identifiers are
used on the magnetic tape for convenience. (See Table 1 for definitions of
the identifiers.) Also note that the data are listed in 1-day blocks and
that days with no field or plasma data are omitted from the listings.


INTENSITY VERSUS TIME PROFILES


The following pages contain profiles of interplanetary plasma and
magnetic field data covering the time period November 27, 1963, to
December 30, 1975.


Harvey McComb:

Next is Mike Gaus from the National Science Foundation.

Mike Gaus, National Science Foundation: 

First of all, I represent an organization that does not have a direct
operational responsibility for generating programs or getting them disseminated.
Secondly, because of the nature of our activities, we have a large number of
grants which provide support to university researchers for study of a large
number of different subjects. As a result of our experience, I have come to
the conclusion that there is a large need for computer programs of varying
complexity and not just those built for large computing systems. I think it is
rather interesting that either this morning or yesterday I was talking to
someone on the phone and the question was raised as to how many civil engineers
use NASTRAN. So, I got to thinking, well, if you visualize the average civil
engineering office, it is usually an office which contains on the order of 14
to 16 engineers, most of them probably, at best, ad hoc programmers, much less
software engineers. One may come to the conclusion that probably not many
civil engineers do use NASTRAN because the typical kind of problem that they
are involved with is more of the garden variety, small building, or small
project of some kind, and NASTRAN would be completely inappropriate for their
particular application. There is a serious problem in generating and
transferring software suitable for an industrial group which is as
disaggregated as civil engineering design and construction. A similar
situation would be found in the mechanical design area. So I guess a good part
of my interest is in a rather diversified group of people.

In connection with large scale computing systems, I think that there is
a big job to be done in improving dissemination, verification, certification
and so forth. However, the less sophisticated user and the large number of
less sophisticated programs also need attention because there seems to be a
large amount of difficulty in transferring such programs from the originator
to potential users. Within the National Science Foundation, computer-related
activities are carried on in several divisions. The Division of Mathematical
and Computer Sciences provides support in what I would call software science.
The Division of Engineering, RANN, and other programs provide support which
often generates programs related to the solution of specific problem areas. To
better understand the problems involved in transferring software from
originators to users, we have supported a number of special studies over the
last several years. Among these were the Workshop on Engineering Software
Coordination in 1972, the ASCE Report on an Investigation of the Feasibility
of Establishing a National Civil Engineering Software Center in 1973, a study
on the Attitudes Toward Computer Software and its Exchange in the Pressure
Vessel Industry in 1973, a report on Industrial Engineering Software Library
in 1973, and a study by CEPA resulting in a report entitled a Proposal for a
National Institute for Computers in Engineering in 1975.

The overall problem is not inconsequential. In 1971 the General Accounting
Office made a study in which they estimated that the investment by the Federal
government in engineering-related software approaches something on the order
of 2 billion dollars per year. The question is, having made this investment,
how can the public get their hands on the large amount of software for which
they have already paid?

The construction people took a look at this particular problem and they
concluded that their difficulty is the fragmented nature of their industry.
As pointed out, often we deal with a small office and we need some means for
getting both large systems transferred to these people and to allow them to
get access to smaller software systems which they may not know about or may
not be able to use even if they find out about them. The conclusion was, we
need some type of software center for the construction industry. We had a
study made by the industrial engineers and they reached roughly the same
conclusions except that their approach was a little different. The industrial
engineers' approach was that they felt that they may be at the stage of
development where, rather than an actual software center for doing this, what
they needed was an information system to allow them to identify all the
various software in their area and related areas which they can get their
hands on.

The CEPA report which Steve Fenves showed you slides of also indicated
the magnitude of the problem in just the one area that they looked at. I
think they identified on the order of 11,000 programs which are kicking around
the country related to their area. Now if you multiply this by the number of
engineering disciplines, this will indicate the difficulties from the
standpoint of the users in trying to find the software which will serve their
particular function.

At the present time there are a number of active groups around the country
who are engaged in the dissemination of software; one was mentioned by Jim,
COSMIC, which is doing an excellent job in the aerospace area. There is one
which we have been involved with more or less for some years. This was
an effort that was set up in connection with the Earthquake Engineering Program
the Foundation supports. A National Information Service for Earthquake
Engineering was set up as a part of this back in 1973, and they decided to
experiment with making available software which would be required for
carrying out structural dynamic analysis related to earthquake effects. This
center now distributes software packages primarily prepared by universities,
but there are some other people who donated various packages to the center
which they had developed and wanted to make available to other people.
More recently they have taken somewhat of a more active approach and they are
trying to take some of their packages and modify them so that they will run on
a wider variety of machines. In this way they make software available to all
who are primarily involved in structural dynamic analysis related to earthquakes.
There still exists a lack of verification, certification type of activity for
these programs.

There is a great need to complete the loop, and Harry Schaeffer, who is
sitting in the front row here, has been very active in pointing this out: that
in order to have a successful dissemination program, you really have to have
a feedback route to get some feedback from the user who will get this software.

In conclusion, there is a question I would like to throw out and I would 
like to get some discussion going on it, this is that there is not only the 
documentation, standardization, and so forth of giant multipurpose programs for 
which I think we are doing a rather excellent job but, in order to look at the 
broader aspect of dissemination and technology transfer, we must also look at 
a diverse group of people who make up the present community of software users. 
This is the end of my presentation. Thank you. 
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Harvey McComb : 


Thank you Mike. The audience has sat there very patiently while we 
unloaded all this stuff on you. So we are going to take a break in Just 
a minute while I make two announcements. In order to provide large space 
for some of the technical sessions, there may be a relocation for some 
of the sessions tomorrow. They still will be in Marvin Center. But you 
are urged to check the registration area on the fourth floor lobby for 
any posted changes. 
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PART II AUDIENCE RESPONSE AND DISCUSSION 


Harvey McComb: 

I open up the meeting for discussion now. If anybody in the audience has 
any questions or comments or remarks, if you would, please state your name and 
affiliation; we would appreciate it. 

Dr. Bodhe, Stone & Webster Engineering Corporation: 

We are involved in nuclear power business and we are users of very large 
systems such as STARDYNE, ICES-STRUDL, SAP, and so on. There are two or three 
questions I have in mind; one is the full implication of certification and 
verification. What do you mean by that in actual terms? Have you considered the 
cost involved in that? A typical example is taking ICES-STRUDL, which happens 
to be one of the dynamic systems. We are testing a structure with several hundred 
joints whose analysis consumed about 13 hours of our computer and cost 15 thousand 
dollars to run after the bugs are out. So, how are you going to pay for this, 
and who's going to pay for this? How are you going to certify programs like this? 
How are we going to certify the people who are going to use this? And the second 
thing is, are we serving the industry by having this kind of certification and 
verification, and what purpose are we achieving? 

Bob Nickell: 


The question is directed toward certification of software and people, and 
the question had like three parts to it. How can we justify the cost, who is 
going to pay for certifying programs and people, and what purpose does it serve 
to certify people? I think those were the three parts of the question. And the 
way the question was asked, implied that there is someone up here who was 
advocating certification of people and of programs, and I don’t believe that is 
the case. We are trying to find out from you whether or not the user community 
thinks that it would be desirable. My comments were aimed at trying to define 
what certification involved. 

Ok, here’s what the costs are like. A typical program for certification 
such as in the case of Nuclear Class Welders is extremely expensive. We are 
talking about thousands of dollars per individual. I do not think that cer- 
tifying computer program users is going to be any cheaper because you are 
going to have to send them off to short courses, you are going to have to 
send them back to school, you are going to have to conduct in-house training 
programs, and all those kinds of overhead items. It is going to be very expensive. 

What about the cost for programs? I do not think anybody ever estimated what it 
takes to certify a program; I do not think it is very desirable in my own 
personal opinion. What purpose does it serve? If it serves the purpose of protecting 
the public safety, and it was absolutely required, it is going to be done and 
somebody is going to pay for it. In the case of certification of Nuclear Class 
Welders, it is paid for by the company itself, which tries to achieve the goal that 
the Nuclear Class Code sets. It is not paid for by an external agency. It 
is an in-house expense, so to speak. Anybody else want to comment? 

Ralph Miller, Boeing Commercial Airplane Company: 

I would like to just make sort of three statements about the contexts of 
what has been said, and I would like to cite as a reference to my remarks our 
experience as engineering users that utilize three CDC 6600's essentially 
24 hours a day and, in the case of structural analysis, programs such as NASTRAN 
which use about 1/10 of a 6600, day in and day out. And from that I would 
like to say that I think the programs work well enough. I do not believe 
there is a significant demand for program certification. I initiated that 
activity at Boeing to try to achieve that. I think it is technically 
feasible, and I do not think the cost is prohibitive, but I do not think that the 
user community and the management can pay the bill on its own. So, therefore, I 
concluded it is not required. I do think that training and usage, however, are a 
major problem. We do use black boxes now as engineers. All of us use Roark's 
handbooks, textbooks. We do not want to think they are black boxes but they really 
are. Very few of us have ever done it or are capable of verifying what is in those 
boxes and I think the computer programs are one of these. And that the industry 
will, in fact, use them as black boxes. And the third point I would like to make 
is that I believe that the marketplace is ready for programs that come with a warranty. 

By that I mean that the user can truly treat them as a black box. He can drive the 
car, he can get on an airplane and not get off at 3962 meters (13,000 feet). 
Programs do make errors, and his interest in a warranty is that his labor is not 
going to be wasted, which in turn may have an impact on the schedule of his work. 
And I think the marketplace is really right for the warranty of programs so 
that people can put these bolts and nuts together to build the products that 
are in their domain. 
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Jay Wiley, Bechtel Power Corporation: 


Being in the nuclear business, I am also a program technical specialist, 
and my responsibility is keeping the computer programs technically correct. And 
the question of certification is very important to me. I see people doing very 
sophisticated calculations, and then going out and taking these numbers and using 
them as the gospel. Under that set of circumstances, I consider it to be 
extremely dangerous because I do not think we know all there is to know about what 
happens in the real world. So I find having to have some sort of correct or 
acceptable answers is very essential, yet to do that is an extremely expensive 
process. And unless somebody comes along and says you are forced to do it, it 
will not get done, simply from an economical point of view. The question about the 
Welder Code: sometimes the Federal Regulations are basically an implication of 
quality assurance requirements. The same federal code could be applied to computer 
programs also; the government would have to simply say this is part of the standard 
U.S. requirements for all nuclear work and then it becomes mandatory. But I do 
not see the federal agencies, especially the Nuclear Regulatory Commission, having 
any guts to do that because we had an example earlier in the opening presentation 
where someone asked a question about doing automated design, and what happens if 
you use the code and the man said that from a basically legal point of view, if he 
does it by the code, that is his defense if it falls down. He follows the code 
which is considered standard acceptance. If the Nuclear Regulatory Commission comes 
along and says I certify the STRUDL program, and I go off and use this STRUDL 
program and something falls apart, I may come back and say I did what you told me to 
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do; it is not my fault. And I would like the panel's comments with regards to 
this particular type of problem. 

Bob Nickell: 

I guess I draw that assignment again. Let’s see if we can recap what consti- 
tutes the question before the house here. As I understand it, you are wondering 
whether or not the federal agencies really have the nerve to impose such a stan- 
dard through the Code of Federal Regulations, and if they will do so. And I will 
say that the comments that I excerpted about computer programs, the classes A, B, 
and C are, in fact, excerpted from the Code of Federal Regulations. It is an 
amendment, shall we say; it is 10 CFR 50. And I see no reason why NRC would not, 
in fact, write an additional amendment for computer program user certification, 
but they will not do so if the ASME Boiler and Pressure Vessel Code and other 
standards organizations do not see a similar need. So I think that we have to 
have more representation in the code committees by people who are software users, 
not just those people who sign stress reports and who are registered, but the 
actual people down on the floor who are doing the computer program calculations. 

We still have not made our point to the code bodies. And I do not believe that we 
are still making our point very clearly even to the Nuclear Regulatory Commission. 
And all I can suggest to you is to keep pushing. They are getting the word, both 
from the front door and the back. And in another couple of years I have a feeling 
something is going to move. 

Anonymous: 

I know I just had a conversation with a fellow just the other day who is 
working on what he calls verification reports for a program. And he went to an 
unnamed individual in the Nuclear Regulatory Commission's technical staff and 
said: "We have this thing and we would like you to review it." And this man 
responded by saying, "Well, until you prove that a sufficient portion of the 
industry is using this thing, we are not going to bother to do it." And that 
is what I mean: they are not going to do it. As far as the people in management, 
you come along and tell somebody you are going to spend $50,000 running a 
simplistic set of test cases on a major software system, you are not going to 
find management too excited about doing that either because eventually that will 
be paid for by the client, and the client will charge the public and so forth. 

It all ends up back to the guy who paid the ultimate bill. And I see we come 
in here and we talk about this thing every couple of years and we do not move 
very far. You are correct, as far as I am concerned, that there is a law 
already on the books that says you have to do it. Since then everyone is 
conveniently avoiding the issue. 

Bob Nickell: 

Let's make sure we differentiate between the inability of an NRC man to find 
the time to review your particular verification process and somebody who is 
actually interested in it. He probably is quite interested in it, but he does not 
have the time. Those people get swamped with so much paper work that it is 
unbelievable. I think that the proper place to work again is through a professional 
society, and through standards organizations, through the ASME Boiler and Pressure 
Vessel Code, not so much through the NRC people. 

N. Krishnamurthy, Vanderbilt University: 

My remarks are addressed to Dr. Fenves. I am sure that a man of Dr. Fenves' 
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experience and stature has thought his remarks through very carefully. I must 
believe it's just a matter of semantics. It intrigues me, however, that he used 
the phrase "ad hoc programmers" nine times and added the adverb "unfortunately" 
three times and to me it sounded as if, and I want his clarification on this, it 
sounded as if he lumped most of the university professors who have come through 
the ranks and almost learned for themselves the crafts of programming and suffered 
through and published some and then went off to conferences, and their graduate 
students go through the same way and they make mistakes and do make inefficient 
programs but somehow in the end they seem to learn from it; and, I may also add 
that most of the programs that are popularly available today seem to have been 
produced by graduate students who sometimes have been ad hoc programmers in their 
own right; so, I would like Dr. Fenves to expand on these remarks and tell us, and 
I am sure other professors and faculty members would be curious, if he considered 
any of this process as inadequate or if he only referred to jumping into the ring 
with inadequate preparation or inadequate or limited experience. 

Steven Fenves: 

That is quite a challenge! I can only refer to this conference. This 
morning I was sitting in the back and listening to the papers. The papers are 
all uniform. Most of them have 20 slides, 10 slides of matrix derivations, problem 
formulation, the speaker clears his throat, and the 11th slide comes up, beautiful 
computer plotted output. Nothing is said about the efforts of six months, nine 
months or one year period from the time the last equation is written down and the 
first tentative, plotted output, not even about the dump output, is produced. That 
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is the area of software engineering. The point that I try to make is that there 
are tools, disciplines, and mechanisms. The panel has mentioned many of these 
which are available for this process. For many of the small processes that we are 
talking about, it is not worthwhile applying them. Even the chart from 
the Bureau of Standards shows that the degree of documentation in the guidelines 
is a function of the cost in manpower effort. So, I think I will stand by my 
statement that most people at universities are in this area which, by the definition 
of the slide, would not require any documentation, and in fact there is not much 
documentation on it. It does not require any formal tools, data structuring, and 
so on, beyond the solving of problems and the programs; in fact, we do not have 
anything more than that. A number of people, you, your colleagues at Vanderbilt, a lot 
of other people have gone beyond that, and have evolved computer programs which 
through one way or another have gotten into practice. The experience has been that 
these are quite difficult to maintain in most cases, and that many of the large 
organizations, McDonnell Automation, for example, do not use the programs as they come 
out of research. They only use algorithms and descriptions and then manufacture 
from them the products they want. So I do not think I have to take anything back. 
And the other thing I would like to point out in terms of what you said about change: Yes, 
there is change coming; our students do go over and take courses in numerical 
analysis, do take courses in Data Structures and so on, and software engineering in 
their work and it does show up. The thing that we learned on our own to survive on 
earlier machines, the students now learn as a formal discipline. And I hope 
that this middle layer of ad hoc programmers eventually disappears. But there will 
not be a need of reinventing the wheel and redoing a lot of ad hoc software in order 
to formulate and solve the type of problems that have been discussed at many of the 
sessions at the Symposium. 
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N. Krishnamurthy: 


Now I agree with you because you have explained the intermediate layer of 
training and discipline that we did not have available ten or fifteen years ago. 

So I tend to agree that if this is true, if professors and students do not take this 
initiative and opportunity and try to start from the do-loop stage of computer 
programming, then they are in for trouble. But I was wondering about the lack of 
explanation of this and now I think I got the idea. I am sure many of my colleagues 
insist upon the students going through this discipline and, although the professor 
might not have gone through these courses himself, he would learn by reflected 
experience. And now we do not have to insist on it. 

John Hendrick, FMC Corporation: 

I am a practicing engineer. I do not go too much with theoretical figures and 
I'd like to direct a question to Mr. Johnson. You made a comment that struck a 
tender nerve. You said something about implementation of computer programs by the 
engineer. Now our company is not large enough to employ a lot of computer pro- 
grammers so I am at the mercy of the universities to create the programs that I 
need for me. Quite often I am called upon to go through the literature, find the 
program that I really like, and then try to implement it. It is a real difficult 
problem sometimes and requires an awful lot of manpower. I would like your com- 
ments on how we can go around this "National Society of Computer Implementators" 
or whatever you might want to call it. 
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Jim Johnson: 


I appreciate your problem because sometimes people think that I am the 
"National Society of Implementation" as well. I gave you the illustration of the 
lieutenant that gave me a call Friday, but the week before that I got a particular 
question about NASTRAN, where a captain called me from the West Coast and wanted to 
know why it is that he cannot use the output 2 file which was generated by NASTRAN. 
He was an engineer and output 2 file meant nothing to him. He probably would not 
even understand the concept of a file being processed by a computer, and did not 
know what installation-defined defaults are, which could affect the formation of the 
data on that file. So we gave him no help since we had nothing to work with other 
than to tell him to go back three steps and start from the beginning. 

The direct answer to your question: Unless you are an ad hoc engineer, I 
strongly urge that you do not try to implement computer programs of any 
significant size and scale. First of all, it is a useless waste of resources and 
manpower. Most installations do have professional software engineers; that is what 
they may call them now, but at Wright-Patterson we used to call them System Analysts, 
Computer System Analysts, Software Engineers, anything you want to. If the program 
is of any significant magnitude in terms of complexity and size, the most important 
job of the engineers is to get out of the loop during the implementation. And that 
includes you and me. Ok, does that answer your question completely? 

John Hendrick: 

As far as you are concerned it does answer the question, but as far as I am 
concerned it does not, because I am still in the same predicament that I was ten 
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minutes ago. 


Jim Johnson: 

I have a solution for you. You have heard papers at this conference pre- 
sented by what I like to think of as data service companies or organizations. 

Most of the major computer manufacturers do provide customer services other than 
hardware acquisition, and the universities are a very good place to look for 
assistance in implementation. I am sure the rates have changed, but one could pay 
a student at least $1.50 an hour to put up a program on the machine. This 
would be an excellent avenue for technical assistance on implementation of 
computer programs. It not only provides some funds for the graduate students that need 
some financial assistance, but it also provides a training ground and some 
educational opportunities for them. Now if you are one of the more prosperous firms, 
you can go directly to a Software Services Center provided by specialized 
organizations like SDRC, or EMRC, or MARC, or similar places, or you could go to the Computer 
Manufacturers themselves, but then you may not be a Boeing or McDonnell or something 
like that. But maybe you can pay the high rates. But certainly if you are small 
as a firm, the universities are an excellent place to get implementation assistance. 

I should caution you though, that in the university environments, the necessity and 
desire are always there for experimentation and these graduate students can fix it 
so you will always have to come back to them. 

John Swanson, Swanson Analysis: 

I think I have heard a theme throughout the conference both in the keynote 
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address and the panel, today, that the user has become much more sophisticated. 

I think I would echo that, and I think I would like to state in the area of 
verification, for example, that if the user wants verification he is going to get it. 
The same as if the user wants documentation he is going to get it. And I think 
that what we are beginning to see is the user saying I want this; and, because, as 
some of you may have noticed, since there is some competition going on in the 
software industry, the user is in the position to say what he wants, and he is quite 
likely to get it. I am not sure how this applies to the Public Domain Programs, 
especially in the area of verification of the Public Domain Programs. Who veri- 
fies a Public Domain Program and guarantees that it stays verified? Also, the 
comments on distribution referred to a good point that distributed programs have to 
be updated, but who would guarantee that the distributed programs are updated? You 
have done your job distributing the update, but it never gets into the program. 
There are a lot of questions here, but I think there has been a change and the user 
now has a lot more say in what he wants to happen. I am sorry to say that we are 
finally getting to publishing the Verification Manual and it is not because of the 
U.S. users; it is because of the European users. The Europeans are much more 
insistent on seeing good verification on a computer program. 
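
As an illustration of what a single entry in a verification manual might look like, the sketch below compares a computed answer against a handbook formula. It is only a minimal example under stated assumptions: the tip-loaded cantilever case, the function names, the tolerance, and the numerical values are all hypothetical and do not come from any program discussed by the panel. The numerical integration merely stands in for "the program under test"; the reference is the standard closed-form tip deflection P L^3 / (3 E I).

```python
def tip_deflection_handbook(load, length, modulus, inertia):
    """Closed-form tip deflection of a tip-loaded cantilever: P L^3 / (3 E I)."""
    return load * length ** 3 / (3.0 * modulus * inertia)


def tip_deflection_numeric(load, length, modulus, inertia, steps=1000):
    """Hypothetical stand-in for the program being verified.

    Integrates the curvature M(x) / (E I) twice along the beam, using the
    curvature at the midpoint of each step.
    """
    dx = length / steps
    slope = 0.0
    deflection = 0.0
    for i in range(steps):
        x_mid = (i + 0.5) * dx
        curvature = load * (length - x_mid) / (modulus * inertia)
        deflection += slope * dx + 0.5 * curvature * dx * dx
        slope += curvature * dx
    return deflection


def verify(computed, reference, rel_tol=1e-5):
    """Pass if the computed value agrees with the reference within rel_tol."""
    return abs(computed - reference) <= rel_tol * abs(reference)


# Assumed test case (illustrative values only): steel-like beam, SI units.
P, L, E, I = 1000.0, 2.0, 200e9, 8e-6
reference = tip_deflection_handbook(P, L, E, I)
computed = tip_deflection_numeric(P, L, E, I)
print("PASS" if verify(computed, reference) else "FAIL")
```

A real verification manual would hold many such cases, and the question raised above remains: someone has to rerun them each time the distributed program is updated.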

Bob Fulton, NASA Langley Research Center: 

We have heard a lot of discussion about dissemination and documentation, and 
it seems to me that we find ourselves with two types of programs that we are con- 
cerned about. The first is the research oriented program, which is developed 
by graduate students or by some of us for our own personal use. It seems to me we 
could dispense with discussing such research programs since they are intended 
primarily for the individual and his own needs, not of general concern to 
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the broad community. We should focus our attention on the second type of 
program, those developed for distribution to others. Programs to be 
distributed to others also fall into two categories: those already developed 
and those to be developed in the future. Most of our discussion seems to be 
focused on trying to collect and/or clean up existing programs, yet it seems to 
me there are limitations on what we can do with existing programs. We are best 
served to implement procedures to ensure that future programs are well developed. 
The discussion from Boeing dealt with how to manage the process to ensure that 
future programs are documented from their very inception. So that, to quote 
Boeing, "Coding is the last stage of documentation." I dare say that most of 
us in this audience, when we have developed codes, have often written the code 
and then documented it. It seems to me that our attention should be first to 
close the barn door, and second, to look around at the horses outside, select 
the high quality ones and do something with some of them to recoup the resources 
we have invested in them. But we must work together to implement businesslike 
programming procedures which first develop documentation and second code, not 
vice versa, as Steve said most ad hoc programmers seem to do. 

Jim Raney, NASA Johnson Space Center: 

I am one of those things that you all have been trying to define for several 
days around here as a software engineer. I am sort of out of place in this 
conference I guess. I am not an ad hoc programmer; I do not claim to be and never was 
(I interned for a few months as a graduate student, of course). What my group has 
to do is to build programs (or systems, as we prefer to call them) for the engineer or 
the group of engineers at our facility that are willing to take the time. Now 
this is my primary point, that it is going to cost you something up front to get 
that barn door closed. You have got to be willing to spend the time with us to define 
what it is you want to develop. Write down, or help us write down, whatever it 
takes to define the barn that you are in. Then we will help you close the door, 
build the system that it requires on the computer to do the job that you want to 
do, document it as we go. But the trouble that we have with our people, even 
today, is that the good engineers are too busy many times to help us do that particular thing. 
And I think that if you really are serious about computer program construction 
in such a way that you will have a viable product that is disseminateable (if there 
is such a thing), that is the price you are going to have to pay. 

Jim Johnson: 

John Hendrick, of Santa Clara, directed a question to me, and I do not think 
I gave him the complete answer, although it was contained in my opening discussion. 

Bob Fulton reminded me of it in his comments. We are talking about large engineering 
software systems. First and foremost, to use one of these systems, one must maintain 
a certain level of capability within his firm. My estimate is that it takes two to 
five people in the company to maintain a large engineering software system, and that 
does not include maintenance and modification. That is just keeping it going in 
your house for someone else to do the maintenance and modification. So it is 
expensive. First of all, there is an overhead cost to you, as a customer user, to maintain 
a large system in-house. Ok, and I do not think that is what you were asking me in 
the question. Let us make the point that any large system for in-house use requires 
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in-house support, and that is not maintenance and modification, but just keeping 
it running. The same group would implement any changes that are sent for updating 
or enhancements. That is, once they are defined and shipped to you, someone has to 
enter these changes in the master code media. That is not maintenance. That is 
just updating your version. And this is why a current list must be maintained of 
all people, old and new, that receive your programs. Now, when you think of the 
cost of a professional, and you are talking about two to five people, you are 
certainly talking about $30,000 to $150,000 a year, depending upon the program, just to 
maintain it in-house. 
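
The arithmetic behind this estimate can be sketched as follows. The per-person figures of $15,000 and $30,000 a year are assumptions, back-calculated from the quoted range (two people at the low figure, five at the high one); the speaker did not state them, and the function name is hypothetical.

```python
def support_cost_range(min_staff, max_staff, low_salary, high_salary):
    """Return (lowest, highest) yearly cost of an in-house support group."""
    return min_staff * low_salary, max_staff * high_salary


# Assumed yearly cost per professional: $15,000 to $30,000 (not from the talk).
low, high = support_cost_range(2, 5, 15_000, 30_000)
print(f"${low:,} to ${high:,} a year")
```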

Anonymous: 

Two comments I would like to make. One essentially is on the aspect of por- 
table software. Our experience at the Bureau of Standards, essentially working 
with other federal agencies in this regard, is that unless portability is one of 
the initial design requirements of the software in the first place, we usually find 
that software developed for a particular application on a particular computer is only 
transportable among other computers of that product line. And if we are talking 
about true portability of software, essentially that has to be in the early design 
specification of that software. I think there is another alternative in this aspect, 
that as you develop software particularly for your own internal use, you may hit upon 
something that you will find to be useful to others, and you want to share that. At 
that point of time you essentially inherit a different set of problems, in that you 
have to go back, redocument, and also call upon some technical competence essentially 
with experience in computer systems and their peculiarity. In order to tailor what 
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you have developed to try to make it portable, this involves another economic con- 
sideration that has to be taken into account in the aspect of sharing software. 

Looking over the several years now that we have shared experience, both from the ADP 
standpoint and the user's standpoint, particularly in this area of application 
engineering where the users are becoming more sophisticated and are able to place 
upon the computer community their requirements, what they want in terms of performance, 
quality, and certification, I feel that we in the ADP community are now ready to 
respond to that type of user specification. And I feel that it is only good for our 
industry and also good from the standpoint of the professions, other than the 
computer profession, that use computer services, that we now start to bound our 
computer services, our computer profession, with the type of requirements that only 
you are capable of laying upon us. 

Dale Seamons, Information System Design: 

My company provides computer services, and in order to be most competitive in 
the marketplace, we find that custom programming that takes full advantage of some 
of the unique capabilities of our hardware is required to be competitive. How do 
we play that against portability? That is an open-ended question that I would like 
to direct to anyone on the panel who may care to respond. 

Bob Nickell: 

I would like to make a brief comment. I concur; we find that, at our company, 
custom programming for a particular computer is ideal as well. We have nothing but 
CDC equipment on site. But when we transfer our programs to other computers, IBM, or 
UNIVAC or some other system, we find that a complete recoding of certain modules is 
much easier than trying to make the program portable to begin with. 


Anonymous: 

Just to ask some general questions back at you. What percent of your code is 
really required to be machine dependent, and is it really necessary to architect 
the software so that it is dependent on that machine? Is there really an advantage 
in doing that? Mainly we see that we can devise an architecture that is portable, 
which eliminates the large amount of portability problem. Secondly, some small 
percent of the code is generally machine dependent, which is generally there for 
the purpose of gaining efficiency. Could we not then design the software in such 
a way that we isolate that part of the code and hence then recode that part to make 
it portable? In other words, if we design the portability in the beginning by 
recognizing the need for it, and then avoid characteristics which are not 
essential to the program, but which are only individual habits. For example, why 
use 10H on 6600 machines when it is so damn difficult to get it over to IBM 
machines? Why not use some other format? This may solve the problem. 
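
The idea of isolating the machine-dependent part can be sketched in a few lines. This is an illustration only, written in Python rather than the FORTRAN of the period; the table, the function names, and the machine keys are hypothetical. It relies on one well-known difference: a CDC 6600 word holds ten 6-bit characters, while an IBM 370 word holds four 8-bit characters, so a label stored as a single 10H word on the 6600 needs three words on the IBM.

```python
# Machine-dependent facts are confined to this one table (assumed layout).
CHARS_PER_WORD = {
    "CDC6600": 10,  # 60-bit word, 6-bit characters
    "IBM370": 4,    # 32-bit word, 8-bit characters
}


def pack_label(text, machine):
    """Split text into blank-padded chunks, one chunk per machine word."""
    width = CHARS_PER_WORD[machine]
    chunks = [text[i:i + width] for i in range(0, len(text), width)]
    return [chunk.ljust(width) for chunk in chunks]


def label_words(text, machine):
    """Portable code asks how many words a label needs; it never assumes 10."""
    return len(pack_label(text, machine))


print(pack_label("STRESS OUT", "CDC6600"))  # one 10-character word
print(pack_label("STRESS OUT", "IBM370"))   # three 4-character words
```

Everything outside the table is the same on both machines, which is the point made above: only the small isolated part has to be recoded when the program moves.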

Dale Seamons: 

I am aware of the trivial portable features, but what I am talking about are 
the really major architectural considerations for a big program. You mentioned 6600 
or perhaps 7600 which has low-speed core, high-speed core, and you can take advantage 
of these system features. Similar features are new concepts on structuring the core, 
the way I see it, and almost dictate in many cases independent versions of programming. 
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Bob Nickell: 


I think in those instances, that you make a deliberate decision to create code 
so as not to be portable. 

Mike Gaus: 

I think this really brings us back to a point that someone up here mentioned 
a little while ago. That is what the incentives will be for portability and if you 
are in the business of making custom software for a particular installation or cus- 
tomer, I would not think that there will be a big incentive to make it portable unless 
you could sell this to many other people. So this is one kind of situation. You 
have another kind of situation if you are looking at it from the standpoint of 
public domain software in which it is a governmental or federal investment being 
made. There the incentives for portability will be much stronger than the case of 
custom type installation. I think that the motivation might be different depending 
on what the viewpoint is. 

Anonymous, ESA: 

We have had a recent experience with a portable program that we had obtained 
from CDC. The CDC version ran on our machine for 96 hours. On an IBM 370/168, it 
took 36 hours of computer time just working on this particular job. The same 
program with some modification, $5000 worth of my work, ran in 5 hours. Since we are 
also in a competitive business, why charge a customer $15,000 if it could run 
for $5000? Therefore, I would say portability at an extremely high price is not a 
desired feature. 
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Jim Johnson: 


I think now we are discussing some of the subtle problems facing our panel
rather than the governing criteria. I would like to use this example to try to
drive the message home. Those of you who have used CDC machines and have had
some experience with SCOPE 3.2 and SCOPE 3.3 and with your COMPASS coding should
know what customizing does for you. You start from scratch again
every time they change the SCOPE system, and the tendency is to make it more difficult
to upgrade to the new system to take advantage of it. I know of one program that
went around this problem by building a special system within it, and every time you
get the executable, you get the same system. And that is what Stan was talking
about in the beginning. In your product development, ensure that your specifications
cover some potential growth or problems in the system. I think what we are talking
about is whether you can build schemes for customizing at the product definition
stage rather than downstream. What I would like to see addressed by the
Panel is whether Stan or someone can define or isolate the useful life of large
engineering software systems. You know, when you stop maintaining them, how do
you amortize them over a period of time? How costly is the job, or what is the
cost to be borne? Stan, do you want to handle that?

Harry White: 

I attended another seminar just a week ago, Monday and Tuesday, run by the 
Department of Defense. They have come to recognize that in many of their weapons
systems, now, software is the pacing item from the point of view of cost,
schedule, and reliability. And they are getting around to asking questions about what
the full life-cycle cost of software is, recognizing that very often software
shows up as a line in a procurement document without any requirement for a product
to be delivered and no assessment of the maintenance cost of the software. And 
what they discover is that when the system actually goes on line, there are 65 
programmers out there necessary to keep the system running, that nobody has ever 
talked about. They have found that for the same price of the system, software 
maintenance cost runs about three times the cost of hardware maintenance. There 
is a reason for that. Software systems tend to reflect the purpose and characte- 
ristics of the organizations that build them. Hence, if the organization producing 
a software system is interested in producing doctoral degrees, they may not be very 
interested in maintainability of that system five years hence by Sandia Corporation 
or Boeing. Therefore, maintenance does not show up as a design criterion in that 
system. If a research group is building a system, they may be interested in 
addressing a new formulation or concept; hence, maintenance is not a very high 
criterion in that system, nor is portability. You will not get in the system any- 
thing that was not placed in the design initially. And it is necessary, therefore, 
that you look at all the parameters you wish to consider in the full life-cycle 
cost of concern to you. Then you had better look at those costs in your initial design.
We know now, at least as the Department of Defense was saying last week, that
of the full cost of software about 65 percent is currently going into
formulation, development, and validation, and about 35 percent into maintenance.
The Department of Defense has tried to estimate the cost of their software each 
year and they only know that it is in excess of 3 billion dollars each year. 

They do not know the exact amount. When they went out to get their first 
estimates and got the figures back, someone went and looked at another place and
found a single system that had more cost associated with it than all the systems
they had up to that time. So they really do not know the full cost, but they
know that maintenance runs approximately 35 percent of that cost. So these
questions are becoming very, very important. Software systems' life cycles
are now running on the order of 10 to 25 years. Of these, the first 5 are spent
developing the system, the next 5 maturing it, and then the next 5 to 15 years are
spent in using it. Somewhere in that neighborhood is what we are looking at,
so we are moving to where we need a more professional approach to software.

Steven Fenves:

I would like to add some comments. The idea that Stan raised, that
software products are a reflection of the milieu out of which they come, I see
happening all the time. Most of us in aero, civil, nuclear, or whatever disciplines
are represented here have had our engineering training on a project basis, where
it was understood from the beginning that we are putting together, in a unique
fashion, building blocks that existed before or possibly some new building blocks.

But the emphasis has always been on projects and I think our software is
developing that way. I see my colleagues in chemical engineering, for example,
operating quite differently. Chemical engineering has always been continuous-process
oriented, manufacturing oriented, long-term-production oriented. Their
programs are written quite differently and their programming endeavor is managed
differently from what we are familiar with. I think this extends not just to the
question of development; it goes beyond that to the question of verification
and certification as well. We are in an industry where there is no single
responsibility, and that again reflects itself in the software. There is verification,
quantification, possibly even certification on the part of the software vendor 
to the designer who operates as an agent of the owner. And then there is an entirely 
different process of the certification of the designer with his software tool being 
certified by the owner or by the regulatory agency. And sometimes when we talk 
about these topics of verification, I am not sure which side we are talking about, 
the vendor delivering a product to the designer who still has the professional 
responsibility for the end design or is it the designer with his tools delivering 
the product to the owner. 

Again, if you look at some other industries where there is single manage- 
ment control and single fiscal control at the top, the problems are quite different. 
So we have to realize that we represent our own backgrounds, and the software we
develop and the software we use directly reflects our background.

Avanti Shroff, URS-Madigan-Prager:

I have been sitting here for two days and there are a couple of things that
are puzzling me in terms of what is happening. I feel there is a lag between the
engineering profession and theoretical investigations and analysis. One of the
things that I have found in my 10 or 11 years of experience, and I have used many
software systems such as NASTRAN and STRUDL, is that the reason we are discouraged
from using many of these programs is the following:

Analysis is an assortment of facts based on assumptions. Engineering
and design are not so much fact as they are improvisation of
code and use of judgement, eventually coming up with something
that is going to stand in the field. When I go out and use a software
system I get a very interesting analysis coming out of this
system, but then I have to take these results and somehow program
them into my own design program. This is because, so far, I have
not found a single software package which could give me a design as
comprehensive as most consulting engineers would like to have. Now
when I find out that I have finished my first cycle of design, I have
to go back to these people for another round of analysis. This I
can avoid by two means. One is if I can get the people who are
developing these various packages to give us a design which is very much
up to date with respect to the code. That very rarely happens; the
assumptions involved are very gross because the people who are writing
these programs are not as much involved with the design processes as
they are involved with highly theoretical analyses.

The second alternative is to write our own process programs with effort
of our own, which, as Dr. Fenves pointed out, is not very software
oriented, and then recycle our own design methods into these programs
and come out with our own answers. Now, this answers one of the
comments Dr. Gaus made, that you take a civil engineering office with 12
to 14 engineers and very few people are using NASTRAN. This, perhaps,
may cast some light on why these people are not using NASTRAN or
ICES-STRUDL.

I do not want to mention any names, but I was in a design
seminar where one of these software companies was describing
the design capabilities of their outfit and I asked a simple
question: "What K value do you use for column design?"
And he said, "Well, K is equal to 1," K meaning the slenderness
ratio effect in column design. This is a very important effect
if you design high-rise structures, main frames, or no matter
what you are designing. And I said, "Have you considered the
fact that this K value has been a very controversial feature
in the last five years and you really should not be using a K
value of 1?" He said, "Well, we are not involved that deeply
in that kind of controversy." Yet I know of many organizations
who will use this program and come up with the final design
answer that uses 8WF36 for a column. If they do not know on
what assumptions this 8WF36 has been selected, it would be
very difficult for that person to present this to the client,
put his signature to it, and find out later that the column
failed in the field.


Therefore, I feel that there is a complete lag between these two domains and I
hope, and I had hoped, that we are going in that direction, that some effort is made
to somehow relate analysis processes with design processes in terms of the latest
code revisions and everything else. This, again, points to the fact that we are
discussing what is the life cycle of a software package. And I think it all leads
together to see that all these things are put together into software packages as we go
along in this software community.

Bob Nickell:

I would like to comment that what you are addressing is the development of
postprocessing systems that are directly related to specific code requirements.
I would like to say this in defense of the software developers. Most software
packages are too general, in terms of the range of applications you can make of them,
to justify the developer getting into the business of furnishing a postprocessor for
each kind of code design for which the package will be used. Therefore, we find
that it either has to fall back on your shoulders as a user or a third party has to
get into the act, either a service bureau or a government agency or someone like
that, who will furnish postprocessors for specific sets of analysis packages.

Let me give you an example. If the user community has enough muscle, they will
force the development of postprocessing packages that will interface with not only one
but, perhaps, a whole family of analysis modules. Then your problem will be
solved. But I think you have to develop a specialty clientele, in that case, so
that if you want a Section 3 design, you are going to get a Section 3
postprocessor that will interface with ANSYS, with MARC, or any other package you are
using. In your case, if you are looking for a civil engineering type module, you
are going to have to get together with another group of people and somehow get it done,
either through the government or common funding.

Stan Hansen: 

Your description of software packages reminds me of how I felt when I went through 
university training where I discovered enormous quantities of lecture material on 
analysis and hardly anything on design. And there was a reason for that, I
discovered after a while. And that is that analysis can be formalized to some large
degree and it tends to be objective, whereas design tends to be a very subjective
act dependent upon a great deal of vocational information relative to individual
projects. Hence, there tends to be a reluctance to place within a hardwired
software system parameters which we would consider to be design and which
normally would come from the mind of the person doing the design. I think that
we are moving towards the time when we can see that certain design processes are
more routine and objective in nature and can be hardwired into software systems.
But I think we find that they are a small set of the total set and the larger set 
is very dependent on the vocational knowledge of the people local to the project. 

Pat Horton, Control Data Corporation: 

I have several comments. First, in this business of pre- and postprocessors,
I think this is an excellent place to advertise our programs because, quite
frequently, with a 50 or 100 line program, you can get a lot out of a program provided
you have some understanding of the program and know where to look.

In your comment earlier, Mr. Johnson, on using Output 2, I recall that one
time I was attempting to use some NASTRAN output and I could not find anything in
the User's Manual directly on any of those output files. I finally wound up
directing the output to a punch file and treating it as a BCD file rather than
binary in order to postprocess it. This is definitely a weakness in NASTRAN's
Manual, but there is a lot to NASTRAN that is hard to find.

On the subject of cost, it was mentioned that there is probably more
investment in software cost than there is in hardware cost. An interesting observation
that I stumbled upon in Datamation is that a rule of thumb in programming is a
man-hour of work per line of source code, from the start of the programming process
until you have a debugged code. And I do not think that this includes the
initial analyst's work or the time to document the program. And several people
since have told me that the above is a good number. It makes it pretty damn
expensive. So, well-organized programming is highly desirable. Unfortunately,
I have seen many examples of not too well-organized programs.
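
To get a feel for what the one-man-hour-per-line rule of thumb implies, the following Python sketch converts a program size into effort. The 2000-hour man-year and the 25,000-line program size are assumptions chosen for illustration only; as noted above, the rule excludes analysis and documentation time.

```python
HOURS_PER_LINE = 1.0         # rule of thumb quoted above: one man-hour per source line
HOURS_PER_MAN_YEAR = 2000.0  # assumption: roughly 50 weeks of 40 hours


def programming_effort(lines_of_code: int) -> tuple[float, float]:
    """Return (man-hours, man-years) to reach debugged code under the rule of thumb."""
    hours = lines_of_code * HOURS_PER_LINE
    return hours, hours / HOURS_PER_MAN_YEAR


# Hypothetical program size, chosen only to show the scale of the numbers.
hours, years = programming_effort(25_000)
print(f"{hours:.0f} man-hours, about {years:.1f} man-years")
```

Under these assumptions a 25,000-line program represents about a dozen man-years of coding and debugging alone.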

On your observation that it takes 2 to 5 programmers/analysts to support a 
large package in-house, this is again oftentimes brought on by companies who
go out and buy a large system for a $200 copy cost and find out that what
they have done is to buy a black box with permission to open it up. To this
end, many companies really do not, but should, think about going to service
bureaus like CDC, Cybernet, McAuto, etc., and taking advantage of their bags. This
may cost you a small amount more per CPU minute to run this way, but it avoids
many headaches. In addition, you can usually get a warranty from your service
companies as to the acceptability and usability of the program. 

F. McVey, McDonnell Aircraft Company:

There is a class of computer programs that I think would require a new form
of documentation. We have a computer-aided design and graphics program that serves
our engineering division. It is an extremely large and very interactive program.
This program is on two 370/168's supporting approximately seventeen 2250 terminals.
We have those terminals operating something on the order of 14 hours a day to generate
design data for drawings for the F-18 program. Those data are then used in a
hectographic program. So the users do not have to know in detail what is in the code,
but they have to know how to use the program. If the program has a lot of
flexibility, there are so many options in the way the user can apply it to a problem
that you need to have some form of training. What we do is to provide
approximately 24 hours of on-line training to a scientist, and we have trained
on the order of 250 engineers. I can see that in the future, systems like this
will be used to the extent that there will be a need for higher level documentation.
This might be a video-tape lecture series, or might be something involving tutorial
on-line operations.

Bob Nickell:

I guess I would agree that there is a future in what we refer to, at Sandia,
as an interactive, interrogative preprocessor. Is that what you have in mind?
That is what I call a level of documentation that is very useful as opposed to a 
lot of documentation that is not very useful. 

Harry White:

I was hoping that someone on the Panel would ask a particular question, but 
since nobody has done so, I will. We hear a lot about verify, validate, qualify, 
but I have not heard much about what one verifies, validates, or qualifies against.
You cannot validate something when you have no standards to validate it against. And you
cannot qualify something when you have nothing to qualify it for, nor a standard to
measure against. I have had a bad experience trying to do that. We have found
that without software system design, we could not validate the code except by
sampling (i.e., just run problems through it). We could not qualify software
systems against real life injury problems without some statement as to what those 
real life injury problems are supposed to be, or for what the software system was 
built. And it led me to believe very strongly in the process I talked about 
here, that you must get your requirements and your specifications in design down 
in order to validate software against something. 


Stan Hansen: 

Your statement about sampling is exactly true. That is the way it is done.
Verification is a sampling process by which you examine the theories upon which
the particular piece of software is based, and you sample selected portions of the
system to see whether it is in fact coded as stated. The process of qualification
is generally a configuration control problem and no one as yet has come up with a
qualification program that attempts to cover the complete map of configuration
controls that the program is capable of. In fact, all they do again is sample a
number of configurations for the particular package that should solve most of the
problems that will be done in that organization. But it is definitely a sampling
process.

I think there is another way. We have validated programs for which we went
into the subroutines and defined the purpose of each subroutine in an objective
statement of what the subroutine is supposed to do; then made a statement of what
the input parameters would need to be (that is, the parameters that are affected
and the values of those parameters in order to exercise what that subroutine was
supposed to do); then built those parameters up until we had a whole set of test
cases. In this way, you have a cause and effect relationship between your testing
and the software being tested. In so doing, we found extremely reliable software
resulting. I guess, Ralph, you would not mind me telling this, that the BUCLASP
program that was delivered to NASA Langley received no award from NASA Langley for
the quality of that system. In some 15 months of operation, I recall, I think
there were 15 errors. In implementing another piece of software in which some
25,000 new statements were implemented, we went in and caused a one-to-one
relationship to exist between test cases and the code being tested. That software
was run over 300 executions a month from July through November with only three
errors.
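
The subroutine-by-subroutine approach described above can be sketched in Python as a test plan keyed by routine, where each entry records the objective statement, the input parameters that exercise it, and the expected result. This is a minimal sketch under stated assumptions: the two routines, the `CaseSpec` record, and the helper names are hypothetical and are not the procedure actually used on the programs mentioned.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass
class CaseSpec:
    objective: str   # objective statement of what the subroutine is supposed to do
    inputs: tuple    # parameter values chosen to exercise that objective
    expected: float  # result the objective statement predicts


# Two hypothetical "subroutines" standing in for real analysis code.
def section_area(width: float, depth: float) -> float:
    return width * depth


def slenderness(length: float, radius_of_gyration: float) -> float:
    return length / radius_of_gyration


# One-to-one mapping: every routine has its own list of test cases.
TEST_PLAN: dict[Callable, list[CaseSpec]] = {
    section_area: [CaseSpec("area of a rectangular section", (2.0, 3.0), 6.0)],
    slenderness: [CaseSpec("length divided by radius of gyration", (120.0, 4.0), 30.0)],
}


def run_plan(plan: dict[Callable, list[CaseSpec]]) -> list[str]:
    """Run every case and report failures by routine and objective."""
    failures = []
    for routine, cases in plan.items():
        for case in cases:
            if not math.isclose(routine(*case.inputs), case.expected):
                failures.append(f"{routine.__name__}: {case.objective}")
    return failures


def untested(routines: list[Callable], plan: dict[Callable, list[CaseSpec]]) -> list[str]:
    """List routines that have no test case, breaking the one-to-one relationship."""
    return [r.__name__ for r in routines if not plan.get(r)]


print(run_plan(TEST_PLAN))
print(untested([section_area, slenderness], TEST_PLAN))
```

Because each failure is reported against a named routine and objective, a failing test points directly at the code responsible, which is the cause and effect relationship referred to above.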

Deene Weidman, NASA Langley Research Center: 

That method is practical, Stan, if you have structured programming. I wish
I had one of my slides here. It shows that, for a simple program, if you tested a 
pattern every nanosecond, it would take 15,000 years to check all the patterns in 
the program. But if you have to search the program, you cannot do it. 
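
The scale of that figure can be checked with a few lines of arithmetic. The Python sketch below assumes a 365.25-day year and interprets "patterns" as distinct execution paths; the comparison with independent two-way branches is an illustration added here, not part of the slide referred to.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year
PATTERNS_PER_SECOND = 1e9              # one pattern tested every nanosecond
YEARS = 15_000                         # figure quoted above

patterns = YEARS * SECONDS_PER_YEAR * PATTERNS_PER_SECOND

# Number of independent two-way branches that would give this many paths.
equivalent_branches = math.log2(patterns)

print(f"{patterns:.2e} patterns, about 2**{equivalent_branches:.0f}")
```

Under these assumptions the figure corresponds to roughly 4.7 x 10^20 patterns, which is what fewer than 70 independent two-way decisions would produce, so exhaustive testing is out of reach even for a modest program.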

With regards to INPUT 2, we are running into situations like that often where 
people have difficulty using some capabilities. We have 5300 pages of documentation 
at the moment. And we have more documents coming out. But sometimes you have 
trouble finding a certain capability. And documentationwise, it is difficult to
document all the capabilities and all the possibilities. When you have an element in
NASTRAN, you are not only having it for statics, you have it for dynamics, for
buckling, vibrations, etc., so that it is a problem for maintenance to have a
program that anyone can have, that anyone can use. Jim is perfectly right that
you have to have some staff who will keep running it for you. I can assure you
that a program that runs now will not run in six months. Not for any errors in
the program, but because the system is changed. And we have many cases where we
have spent many hours correcting a code to run under a new operating system.

Harvey McComb: 

It is beginning to get around 5 o'clock. I would like to take one or two
more questions. 

David Perlmutter, COMSAT Laboratories: 

We have a sort of a unique situation, and I am curious to find out if some- 
body else has the same situation. We discussed this afternoon having a staff to 
maintain a fairly large program. But we have NASTRAN, BANDIT, and a few other pre-
and postprocessing programs that go with NASTRAN. We have absolutely no staff to 
support it. The engineering staff is struggling through NASTRAN trying to keep 
alive. If anybody else has this situation, I want to know how to remedy it. I 
am assuming that we are talking about a high priority engineering staff that runs 
around and buys large programs. Just last week I saw a demonstration upstairs in 
the labs of a half million dollar circuit design system that somebody bought but 
nobody knows how to use or what it is good for. We are fairly unique in that we
have fairly large programs, and a fairly extensive library of them, and no regular staff
to support it. We have a computer center. We might do well to destroy it and pick
up an outside service, but we do have a computer center and they are fairly
insistent on staying in-house. We have a weird situation.

Jim Johnson:

Maybe I should answer your question with another question. Who runs your 
shop, the Computer Center or the engineers? 

David Perlmutter:

The engineers. 

Jim Johnson: 

Then you should be able to get some programming assistance on your side of
the house, some scientific computing people. Without that, your programs are going
to die.

Anonymous:

One point that a lot of you are missing is that there are two kinds of
maintenance. There is one kind to keep the program alive, the other kind to keep the
engineer happy. You have got to have computer professionals to keep the big
programs alive. And the way to do the other one is to realize that an engineer is
dynamic, if he is doing his job right, and he must have dynamic programming
support to get it going. And I think that is where our problem is. We have
got to admit that he is dynamic, and we have got to treat him as a dynamic
quantity and go forth.

Harold Conner, Portland Cement Association: 

This late time is an inappropriate time to bring up a controversial point, 
but I would like to just say that many of the problems that have been discussed
here today, I think, have been addressed by this study of the National Institute 
for Computers in Engineering. I feel that if we continue to go in that direction, 
we will eliminate many of these problems. 

Harvey McComb:

I am sure Mike Gaus, or Dave Schelling, or somebody will be glad to hand out 
some reports on that NICE study. 

Mike Gaus: 

I gave them all away. 

Harvey McComb: 

I guess that we should perhaps wrap up our biennial discussion of this topic. I
found that this has been very illuminating to me, and I hope that all of you have
also gained something from the discussion.

There are two things that I want to do before we wrap it up and the first is
that Nick Perrone has an announcement about his program this evening. Nick.

Nick Perrone: 

I cannot help but add an anticlimactic footnote. In listening to some
of these comments, it seems that this session might be labelled "The Confession Of 
A Frustrated Engineer, or Everything You Wanted To Know About Software But Were 
Afraid To Ask." But, perhaps, things have not really changed. I think that we have 
addressed more questions and suggested more problems than answers. I think that is 
healthy; that is a reason for a symposium. But I think there are still some major 
initiatives that would be welcomed and I think they are alluded to here. 

Harvey McComb:

I would like to express my appreciation to the Panel members who have spent 
some time preparing these presentations for you. I would also like to express my 
appreciation to the audience who stuck with us here and contributed a great deal to 
our discussion. Thank you and good evening!